Comment by zahlman a day ago

> then they confidently send me something that breaks on testing it, then half a dozen more iterations, then "python2 is doing the wrong thing or, 'I could get this working but it isn't worth the effort'"

It almost works as-is in my testing. (By the way, there's a typo in the usage message.) Here is my test process:

  #!/usr/bin/env python
  import random, sys, time
  
  
  def out(b):
      # ASCII 0..7 for the second digit of the color code in the escape sequence
      color = random.randint(48, 55)
      sys.stdout.buffer.write(bytes([27, 91, 51, color, 109, b]))
      sys.stdout.flush()
  
  
  for i in range(32, 256):
      out(i)
      time.sleep(random.random()/5)
  
  
  while True:
      out(random.randint(32, 255))
      time.sleep(0.1)

I suppressed random output of C0 control characters to avoid messing up my terminal, but I added a test that basic ANSI escape sequences can work through this.

(My initial version of this didn't flush the output, which mistakenly led me to try a bunch of unnecessary things in the main script.)

After fixing the `print` calls, the only thing I was forced to change (although I would do the code differently overall) is the output step:

  # sys.stdout.write(out.encode("UTF-8"))
  sys.stdout.buffer.write(out.encode("UTF-8"))
  sys.stdout.flush()

I've tried this out locally (in gnome-terminal) with no issue. (I also compared to the original; I have a local build of 2.7 and adjusted the shebang appropriately.)
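
To illustrate why that one line has to change, here is a minimal sketch (the helper name `write_utf8` and its `stream` parameter are made up for this example, they are not part of `ibmfilter`). In Python 3, `sys.stdout` is a text stream that only accepts `str`; the bytes produced by `.encode()` have to go to the underlying binary stream, `sys.stdout.buffer`.

```python
import sys


def write_utf8(text, stream=None):
    """Encode text as UTF-8 and write the raw bytes to a binary stream.

    `stream` is a hypothetical parameter added so the helper can be tested;
    by default it targets the binary layer underneath sys.stdout.
    """
    if stream is None:
        # sys.stdout itself is a text stream and rejects bytes with TypeError;
        # its .buffer attribute is the binary stream it wraps.
        stream = sys.stdout.buffer
    stream.write(text.encode("utf-8"))
    # Flush so that output appears immediately rather than when a buffer fills.
    stream.flush()
```

Calling `write_utf8("\u2588")` from a script run in a terminal should print a full-block character, whereas `sys.stdout.write("\u2588".encode("utf-8"))` raises `TypeError` on Python 3.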

There's a warning that `bufsize=1` no longer actually means a byte buffer of size 1 for reading (instead it's magically interpreted as a request for line buffering), but this didn't cause a failure when I tried it. (And setting the size to e.g. `2` didn't break things, either.)
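
One possible way to sidestep the warning, sketched here under assumptions and not tested against `ibmfilter` or NetHack: ask for an unbuffered pipe with `bufsize=0` and read in larger chunks. The names `pump`, `sink` and `chunk_size` are hypothetical, and the sketch assumes the child's output is cp437 and ignores the `special` overrides.

```python
import subprocess
import sys


def pump(argv, sink, chunk_size=4096):
    """Run argv and copy its stdout to sink, re-encoding cp437 as UTF-8."""
    # bufsize=0 makes handle.stdout a raw, unbuffered stream, so read()
    # returns as soon as some bytes are available (at most chunk_size).
    handle = subprocess.Popen(argv, stdout=subprocess.PIPE, bufsize=0)
    while True:
        chunk = handle.stdout.read(chunk_size)
        if not chunk:  # b"" means the child closed its stdout
            break
        # cp437 is a single-byte encoding, so a chunk boundary can never
        # split a character in the middle.
        sink.write(chunk.decode("cp437").encode("utf-8"))
        sink.flush()
    return handle.wait()
```

A script would call something like `pump(sys.argv[1:], sys.stdout.buffer)`. Reading whatever is available in one go, instead of one byte per loop iteration, also means far fewer writes and flushes per screen update.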

I also tried having my test process read from standard input; the handling of ctrl-C and ctrl-D seems to be a bit different (and in general, setting up a Python process to read unbuffered bytes from stdin isn't the most fun thing), but I generally couldn't find any issues here, either. Which is to say, the problems there are in the test process, not in `ibmfilter`. The input is still forwarded to, and readable from, the test process via the `Popen` object. And any problems of this sort are definitely still fixable, as demonstrated by the fact that `curses` is still in the standard library.

Of course, keys in the `special` mapping need to be defined as bytes literals now. Although that could trivially be adapted if you insist.
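
A minimal sketch of why the key type matters (the `translate` helper and the table contents are illustrative, not the real `special` mapping, and cp437 is assumed as the source encoding): a read from the pipe returns `bytes` in Python 3, and a `bytes` object never compares equal to a `str`, so a str-keyed table would silently stop matching.

```python
# Hypothetical override table; the real `special` mapping is not reproduced here.
special = {
    b"\xff": "X",  # illustrative override for byte 0xFF
}


def translate(byte):
    """Map one byte, as returned by handle.stdout.read(1), to a str."""
    # Overrides are looked up by the bytes object itself, which is why the
    # keys must be bytes literals rather than str.
    if byte in special:
        return special[byte]
    # Everything else goes through the ordinary codec (assumed to be cp437).
    return byte.decode("cp437")
```

With this table, `translate(b"\xff")` returns the override, while other bytes fall through to the codec.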

capitainenemo 11 hours ago

Sorry, I'm not a python guy, do you have a script you'd like me to run against python3? Just toss me a pastebin link, and ideally the version of python3 to run, since half the python3 scripts on my system seem to require a different version of python3 from the other half and a variety of isolated sets of python libs in virtual environments (heck, pip even warns you not to try installing libs globally so everyone can use same set these days). I'd rather not try to follow a set of suggestions and then be told I did it wrong.

As for typo, yep. But then, I've left this script essentially untouched for a couple of decades since I was given it.

  • zahlman 10 hours ago

    > do you have a script you'd like me to run against python3? Just toss me a pastebin link, and ideally the version of python3 to run

    Here's a diff:

      diff --git a/ibmfilter b/ibmfilter
      index 245d32c..2633335 100755
      --- a/ibmfilter
      +++ b/ibmfilter
      @@ -1,6 +1,5 @@
      -#!/usr/bin/python2 -tt
      -# vim:set fileencoding=utf-8
      - 
      +#!/usr/bin/python3
      +
       from subprocess import *
       import sys 
       import os, select
      @@ -10,8 +9,8 @@ special = {
       }
        
       if len(sys.argv) < 2:
      -    print "usage: ibmfilter [command]"
      -    print "Runs command in a subshell and translates its output from ibm473 codepage to UTF-8."
      +    print("usage: ibmfilter [command]")
      +    print("Runs command in a subshell and translates its output from ibm473 codepage to UTF-8.")
           sys.exit(0)
        
       handle = Popen(sys.argv[1:], stdout=PIPE, bufsize=1)
      @@ -26,8 +25,10 @@ while buf != '':
               os.kill(handle.pid)
               os.system('reset')
               raise Exception("Timed out while waiting for stdout to be writeable...")
      -    sys.stdout.write(out.encode("UTF-8"))
      - 
      +    sys.stdout.buffer.write(out.encode("UTF-8"))
      +    sys.stdout.flush()
      +
           buf = handle.stdout.read(1)
        
       handle.wait()
    
    I have already tested it, and as far as I can tell it works fine on every version from 3.3 through 3.13 inclusive. There's really nothing version-specific here, except the warning I mentioned, which was introduced in 3.8. If you encounter a problem, some more sophisticated diagnostics would be needed, and honestly I'm not actually sure where to start with that. (Although I'm mildly impressed that you still have access to a 2.7 interpreter in /usr/bin without breaking anything else.)

    If you want to add overrides, you must use bytes literals for the keys. That looks like:

      b'\xff': 'X'
    
    > (heck, pip even warns you not to try installing libs globally so everyone can use same set these days)

    Some Python programs have mutually incompatible dependencies, and you can't really have two versions of the same dependency loaded in the same runtime. This has always been a problem; you're just looking at the current iteration of pip trying to cooperate with Linux distros to help you not break your system as a result.

    "Using the same set" is not actually desirable for development.

    • capitainenemo 7 hours ago

      So, the patch failed with both my original file and the pastebin one - perhaps due to indentation of Hacker News, so I manually applied since it did seem pretty straightforward - honestly given how short the file was, it would have taken up the same amount of space here as the diff I think, but hopefully I applied it correctly. Manual copy/paste in python always worries me w/ the significant white-space thing (one of our friends accidentally DoS'd our server with his first python script due to that, but that was back when mixing tabs and spaces didn't throw an error by default), but I probably did it right.

      And with that out of the way. This one seems to mostly work!

      So python3 did not significantly change handling this sort of byte stream and while Mercurial folks might well have had their own woes, I have no idea what the issues were in all those prior attempts with this file.

      ... that said, it does do one odd thing (following is output on launching):

          /usr/lib/python3.12/subprocess.py:1016: RuntimeWarning: line buffering (buffering=1) isn't supported in binary mode, the default buffer size will be used
              self.stdout = io.open(c2pread, 'rb', bufsize)
      
      And yet, so far I can't spot any issues in gameplay caused by this, so I'm inclined to let it pass? But it does make me wonder if later on I might hit issues...

      At least for now, I'm going to tentatively say it seems fine. Hm. You know what. Let me try with some more obvious things that might fail if the buffer size is wrong.

      So. Now I'm wondering, given how relatively minor this change is (aside from the odd error message and the typical python3 changes, just one slightly modified line and one inserted line), why did so many pythonistas have so much difficulty over the many years I asked about this? I mean, I only formed my opinion that maybe there was a problem with python3 byte/string handling due to just how many attempts there were... Were they trying to do things in a more idiomatic python3 fashion? Did the python3 APIs change? Does the error hint at something more concerning? Well, whatever. Clearly it's (mostly) fine now. And my carefully tweaked nethack profile is safe if python2 is removed without needing to make my own stream filter. Yay! Thanks!

      ... further updates.. ok there are a few issues.

      1) the warning

      2) there's an odd ghost flicker that jumps around the nethack level as if a cursor is appearing - does not happen in the python2 one.

      3) on quitting it no longer exits gracefully and I have to ctrl-c the script.

      4) It is much slower to render. The python2 one draws a screen almost instantly for most uses (although still a bit slower than not filtered, at least on this computer, for things that change a lot, like video). This one ripples down - that might explain the ghost flickering in 2) and might be related to the buffer warning. This becomes much more noticeable with BBSes, although it is usually fine in nethack. You can see the difference on a simpler testcase, without setting up a BBS account, by streaming a bit more data at once, say by running: ibmfilter curl ascii.live/nyan

      So, clearly not perfect but.. eh. functional? Still far better than prior attempts, and at least it mostly works with nethack.

      • zahlman an hour ago

        > perhaps due to indentation of Hacker News, so I manually applied since it did seem pretty straightforward

        Yes, that would be exactly why. You can use e.g. `sed` to remove leading whitespace from each line (I used it to add the leading whitespace for posting).

        > ... that said, it does do one odd thing (following is output on launching):

        Yes, that's the warning I mentioned. The original code requests to use a buffer size of 1, which is no longer supported (it now means to use line buffering).

        > It is much slower to render.

        Avoiding line buffering (by requiring a buffer size of 2 or more) might fix that. Actually, it might be a good idea to use a significantly larger buffer, so that e.g. an entire ANSI colour code can be read all at once.

        The other issues are, I'm pretty sure, because of other things that changed in how `subprocess` works. Fixing things at this level would indeed require quite a bit more hacking around with the low-level terminal APIs.

        > I mean, I only formed my opinion that maybe there was a problem with python3 byte/string handling due to just how many attempts there were... Were they trying to do things in a more idiomatic python3 fashion? Did the python3 APIs change? Does the error hint at something more concerning?

        Most likely, other attempts either a) didn't understand what the original code was doing in precise enough detail, or b) didn't know how to send binary data to standard output properly (Python 3 defaults to opening standard output as a text stream).

        All of that said: I think that nowadays you should just be able to get a build of NetHack that just outputs UTF-8 characters directly; failing that, you can use the `locale` command to tell your terminal to expect cp437 data.