Comment by zahlman
> a tiny python2 script that I have used for decades for conversion of character streams in terminals has proved to be repeated unportable to python3.
Show me.
> then they confidently send me something that breaks on testing it, then half a dozen more iterations, then "python2 is doing the wrong thing or, 'I could get this working but it isn't worth the effort'"
It almost works as-is in my testing. (By the way, there's a typo in the usage message.) Here is my test process:
#!/usr/bin/env python
import random, sys, time
def out(b):
    # ASCII 0..7 for the second digit of the color code in the escape sequence
    color = random.randint(48, 55)
    sys.stdout.buffer.write(bytes([27, 91, 51, color, 109, b]))
    sys.stdout.flush()
for i in range(32, 256):
    out(i)
    time.sleep(random.random()/5)
while True:
    out(random.randint(32, 255))
    time.sleep(0.1)
I suppressed random output of C0 control characters to avoid messing up my terminal, but I added a test that basic ANSI escape sequences can work through this. (My initial version of this didn't flush the output, which mistakenly led me to try a bunch of unnecessary things in the main script.)
After fixing the `print` calls, the only thing I was forced to change (although I would write the code differently overall) is the output step:
# sys.stdout.write(out.encode("UTF-8"))
sys.stdout.buffer.write(out.encode("UTF-8"))
sys.stdout.flush()
I've tried this out locally (in gnome-terminal) with no issue. (I also compared to the original; I have a local build of 2.7 and adjusted the shebang appropriately.)
There's a warning that `bufsize=1` no longer actually means a byte buffer of size 1 for reading (instead it's magically interpreted as a request for line buffering), but this didn't cause a failure when I tried it. (And setting the size to e.g. `2` didn't break things, either.)
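If the warning is a concern, one option that should avoid it is `bufsize=0`, which `subprocess` documents as unbuffered. A minimal sketch, not taken from `ibmfilter`: the child command and the name `read_bytes_unbuffered` are made up for illustration.

```python
import subprocess
import sys

# Stand-in child process (an assumption for this demo): emits three raw
# bytes, one of them outside ASCII.
child = [sys.executable, "-c",
         "import sys; sys.stdout.buffer.write(bytes([65, 66, 255]))"]

def read_bytes_unbuffered(cmd):
    """Collect the child's stdout one byte at a time, with no buffering."""
    received = []
    # bufsize=0 requests an unbuffered pipe, so the line-buffering
    # warning tied to bufsize=1 does not apply
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, bufsize=0) as handle:
        while True:
            buf = handle.stdout.read(1)
            if not buf:        # b'' signals end of stream
                break
            received.append(buf)
    return received

result = read_bytes_unbuffered(child)
```

Each element of `result` is a one-byte `bytes` object, which is the form the rest of the script would then have to look up and decode.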
I also tried having my test process read from standard input; the handling of ctrl-C and ctrl-D seems to be a bit different (and in general, setting up a Python process to read unbuffered bytes from stdin isn't the most fun thing), but I generally couldn't find any issues here, either. Which is to say, the problems there are in the test process, not in `ibmfilter`. The input is still forwarded to, and readable from, the test process via the `Popen` object. And any problems of this sort are definitely still fixable, as demonstrated by the fact that `curses` is still in the standard library.
Of course, keys in the `special` mapping need to be defined as bytes literals now. Although that could trivially be adapted if you insist.
Sorry, I'm not a python guy, do you have a script you'd like me to run against python3? Just toss me a pastebin link, and ideally the version of python3 to run, since half the python3 scripts on my system seem to require a different version of python3 from the other half and a variety of isolated sets of python libs in virtual environments (heck, pip even warns you not to try installing libs globally so everyone can use the same set these days). I'd rather not try to follow a set of suggestions and then be told I did it wrong.
As for typo, yep. But then, I've left this script essentially untouched for a couple of decades since I was given it.
> do you have a script you'd like me to run against python3? Just toss me a pastebin link, and ideally the version of python3 to run
Here's a diff:
diff --git a/ibmfilter b/ibmfilter
index 245d32c..2633335 100755
--- a/ibmfilter
+++ b/ibmfilter
@@ -1,6 +1,5 @@
-#!/usr/bin/python2 -tt
-# vim:set fileencoding=utf-8
-
+#!/usr/bin/python3
+
from subprocess import *
import sys
import os, select
@@ -10,8 +9,8 @@ special = {
}
if len(sys.argv) < 2:
- print "usage: ibmfilter [command]"
- print "Runs command in a subshell and translates its output from ibm473 codepage to UTF-8."
+ print("usage: ibmfilter [command]")
+ print("Runs command in a subshell and translates its output from ibm473 codepage to UTF-8.")
sys.exit(0)
handle = Popen(sys.argv[1:], stdout=PIPE, bufsize=1)
@@ -26,8 +25,10 @@ while buf != '':
os.kill(handle.pid)
os.system('reset')
raise Exception("Timed out while waiting for stdout to be writeable...")
- sys.stdout.write(out.encode("UTF-8"))
-
+ sys.stdout.buffer.write(out.encode("UTF-8"))
+ sys.stdout.flush()
+
buf = handle.stdout.read(1)
handle.wait()
I have already tested it and it works fine as far as I can tell on every version from at least 3.3 through 3.13 inclusive. There's really nothing version specific here, except the warning I mentioned which is introduced in 3.8. If you encounter a problem, some more sophisticated diagnostics would be needed, and honestly I'm not actually sure where to start with that. (Although I'm mildly impressed that you still have access to a 2.7 interpreter in /usr/bin without breaking anything else.)
If you want to add overrides, you must use bytes literals for the keys. That looks like:
b'\xff': 'X'
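To illustrate why the key type matters, here is a small sketch of such a lookup. It is not the script's actual code: the function name `translate` and the `cp437` codec are assumptions (the usage text says "ibm473", which looks like the typo mentioned earlier).

```python
# Hypothetical override table; keys are bytes because read(1) on a
# binary pipe returns bytes in Python 3.
special = {b'\xff': 'X'}

def translate(buf, codec="cp437"):
    """Map one input byte to text, preferring an override if one exists."""
    if buf in special:
        return special[buf]
    return buf.decode(codec)   # codec is an assumption, see above

# A str key would silently never match a bytes value:
str_key_matches = '\xff' in special
```

With a `str` key the override simply never fires, with no error raised, so the byte falls through to the codec.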
> (heck, pip even warns you not to try installing libs globally so everyone can use same set these days)
Some Python programs have mutually incompatible dependencies, and you can't really have two versions of the same dependency loaded in the same runtime. This has always been a problem; you're just looking at the current iteration of pip trying to cooperate with Linux distros to help you not break your system as a result.
"Using the same set" is not actually desirable for development.
So, the patch failed with both my original file and the pastebin one - perhaps due to indentation of Hacker News, so I manually applied since it did seem pretty straightforward - honestly given how short the file was, it would have taken up the same amount of space here as the diff I think, but hopefully I applied it correctly. Manual copy/paste in python always worries me w/ the significant white-space thing (one of our friends accidentally DoS'd our server with his first python script due to that, but that was back when mixing tabs and spaces didn't throw an error by default), but I probably did it right.
And with that out of the way. This one seems to mostly work!
So python3 did not significantly change handling this sort of byte stream and while Mercurial folks might well have had their own woes, I have no idea what the issues were in all those prior attempts with this file.
... that said, it does do one odd thing (following is output on launching):
/usr/lib/python3.12/subprocess.py:1016: RuntimeWarning: line buffering (buffering=1) isn't supported in binary mode, the default buffer size will be used
self.stdout = io.open(c2pread, 'rb', bufsize)
And yet, I can't spot any issues in gameplay caused by this so far, so I'm inclined to let it pass? But it does make me wonder if, later on, I might hit issues... At least for now, I'm going to tentatively say it seems fine. Hm. You know what. Let me try with some more obvious things that might fail if the buffer size is wrong.
So. Now I'm wondering: given how relatively minor this change is (aside from the odd error message and the typical python3 changes, just one slightly modified line and one inserted line), why did so many pythonistas have so much difficulty over the many years I asked about this? I mean, I only formed my opinion that maybe there was a problem with python3 byte/string handling due to just how many attempts there were... Were they trying to do things in a more idiomatic python3 fashion? Did the python3 APIs change? Does the error hint at something more concerning? Well, whatever. Clearly it's (mostly) fine now. And my carefully tweaked nethack profile is safe if python2 is removed without needing to make my own stream filter. Yay! Thanks!
... further updates.. ok there are a few issues.
1) the warning
2) there's an odd ghost flicker that jumps around the nethack level as if a cursor is appearing - does not happen in the python2 one.
3) on quitting it no longer exits gracefully and I have to ctrl-c the script.
4) It is much slower to render. the python2 one draws a screen almost instantly for most uses (although still a bit slower than not filtered, at least on this computer, for things that change a lot, like video). This one ripples down - that might explain the ghost flickering in ② and might be related to the buffer warning. This becomes much more noticeable with BBSes although it is usually fine in nethack. You can see the difference on a simpler testcase without setting up a BBS account by streaming a bit more data at once say by running: ibmfilter curl ascii.live/nyan
So, clearly not perfect but.. eh. functional? Still far better than prior attempts, and at least it mostly works with nethack.
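A possible explanation for 3), plus an untested guess about 4): the hunk header in the diff shows the loop condition is still `while buf != '':`. In Python 3, `read(1)` on a binary pipe returns `b''` at end of stream, and `b'' != ''` is always true, so that loop would never end on its own. Writing and flushing one byte at a time may also account for some of the slowness. The sketch below is not the original script: `pump` is a made-up name, `cp437` is an assumed codec, and the `special` overrides are left out.

```python
import io

def pump(src, dst, codec="cp437", chunk_size=4096):
    """Copy src to dst, re-encoding from a single-byte codec to UTF-8."""
    while True:
        # read1() does at most one underlying read, so it returns what is
        # available (up to chunk_size) rather than waiting for a full chunk
        chunk = src.read1(chunk_size)
        if not chunk:          # b'' means end of stream
            break
        # decoding per chunk is safe: a single-byte codec never splits
        # a character across chunk boundaries
        dst.write(chunk.decode(codec).encode("utf-8"))
        dst.flush()            # once per chunk rather than once per byte

# The comparison that would keep the patched loop alive in Python 3:
never_equal = (b'' != '')      # bytes and str never compare equal

# Small in-memory demo standing in for the pipe and the terminal.
demo_out = io.BytesIO()
pump(io.BytesIO(b"A\xb0"), demo_out)
```

In the real script this would presumably be called as `pump(handle.stdout, sys.stdout.buffer)` with `Popen` left at its default `bufsize`, so that `handle.stdout` is a buffered reader with a `read1` method. Whether this removes the rippling and the ghost cursor is untested.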
Heh. It always starts this way... then they confidently send me something that breaks on testing it, then half a dozen more iterations, then "python2 is doing the wrong thing" or, "I could get this working but it isn't worth the effort" but sure, let's do this one more time. Could be they were all missing something obvious - wouldn't know, I avoid python personally, apart from when necessary like with LLM glue. https://pastebin.com/j4Lzb5q1
This is a script created by someone on #nethack a long time ago. It works great with other things as well like old BBS games. It was intended to transparently rewrite single byte encodings to multibyte with an optional conversion array.