Show HN: Textcase: A Python Library for Text Case Conversion
(github.com) 69 points by zobweyt a day ago
Thank you for the kind words! I'm glad you appreciate the effort put into covering all those edge cases.
It sounds like you had quite the adventure with text casing on your project. I'm happy this library can save you some time and hassle. Looking forward to seeing what can be built with it!
Many edge cases can be found with regard to casing, like title-case characters.
https://www.unicode.org/versions/Unicode16.0.0/core-spec/cha...
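To make the title-case point concrete, here is a small sketch using only Python's standard library (it does not use textcase, and says nothing about how textcase treats these characters). The digraph letter ǆ (U+01C6) has three distinct case forms, so "uppercase the first letter" and "titlecase the first letter" are not the same operation:

```python
import unicodedata

small = "\u01c6"        # ǆ, a single code point for the "dž" digraph

upper = small.upper()   # full uppercase form, U+01C4
title = small.title()   # titlecase form, U+01C5 (capital D, small z)

# Print code point, general category (Ll / Lu / Lt) and official name.
for label, ch in (("lower", small), ("upper", upper), ("title", title)):
    print(label, f"U+{ord(ch):04X}", unicodedata.category(ch), unicodedata.name(ch))
```

The general category `Lt` ("Letter, titlecase") is what marks these characters out; only a handful of code points carry it.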
I might be jaded, but I think having libraries for such simple use cases leads to the inevitable `left-pad` situation.
By simple use cases, I mean that since you probably don't need all of these functions at once, it would be easier to copy the code you need (if you don't feel comfortable writing it yourself) than to add yetanotherlibrary to your dependency tree.
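For a sense of scale, this is roughly what a copy-and-paste snake_case converter could look like. It is a minimal sketch, not textcase's implementation: `to_snake` is a hypothetical helper, it only detects ASCII word boundaries, and it ignores the Unicode and language issues raised elsewhere in this thread.

```python
import re


def to_snake(text: str) -> str:
    """Hypothetical helper: convert an ASCII identifier to snake_case."""
    # Split an acronym from a following capitalised word: "JSONParser" -> "JSON_Parser".
    text = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1_\2", text)
    # Split a lowercase letter or digit from a following capital: "myJSON" -> "my_JSON".
    text = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", text)
    # Treat runs of whitespace, underscores and hyphens as one delimiter,
    # dropping the empty pieces produced by leading or trailing delimiters.
    words = [w for w in re.split(r"[\s_\-]+", text) if w]
    return "_".join(w.lower() for w in words)


print(to_snake("IOStream"))             # io_stream
print(to_snake("myJSONParser"))         # my_json_parser
print(to_snake("__weird--var _name-"))  # weird_var_name
```

A dozen lines cover the common inputs; the trade-off being debated is whether the remaining edge cases are worth a dependency.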
I understand your perspective, and it's a valid concern. However, this library is designed to support not only simple use cases but also more advanced scenarios, providing a comprehensive solution for various needs. Additionally, it has zero dependencies, which helps keep your project lightweight. This way, you can benefit from the library's features without adding unnecessary complexity to your dependency tree. Thank you for sharing your thoughts!
> A feature complete Python text case conversion library
I suspect you mean "featureful", "full-featured" or similar[1]—"feature complete" means that you're not going to add any more features.
[1] https://english.stackexchange.com/questions/393517/what-do-y...
Thanks!
It does not support non-English title casing. From the documentation:
> It also works non-ascii characters. However, no inferences on the language itself is made. For instance, the digraph ij in Dutch will not be capitalized, because it is represented as two distinct Unicode characters. However, æ would be capitalized
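The behaviour described in that quote can be reproduced with Python's built-in string methods, used here as a stand-in rather than textcase itself. `capitalize_dutch` is a hypothetical helper that only illustrates what a language-aware rule would have to do; it is not part of any library.

```python
def capitalize_dutch(word: str) -> str:
    """Hypothetical helper: capitalise a word, treating a leading 'ij' as one letter."""
    if word[:2].lower() == "ij":
        return "IJ" + word[2:]
    return word[:1].upper() + word[1:]


naive = "ijsselmeer".capitalize()        # language-agnostic: only the 'i' is raised
aware = capitalize_dutch("ijsselmeer")   # Dutch convention: both letters are raised
single = "ærø".capitalize()              # æ is one code point, so it capitalises as expected

print(naive, aware, single)  # Ijsselmeer IJsselmeer Ærø
```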
I was talking about the specific rules that are in place for title capitalization. As you can see in my example, the uppercase letters seem randomly placed for a title, but they are indeed correct. German has similar issues, where capitalization carries meaning for the word itself. That kind of thing.
It looks like your library does not support it, which is understandable, it is a huge problem to tackle, but I just wanted to be sure.
Thank you for the clarification! I understand that title capitalization can be quite complex, especially with specific rules in languages like German where capitalization can change the meaning of a word.
I guess handling these nuances falls under the broader categories of internationalization (i18n) and localization (l10n).
> It does not support non-English title casing
Perhaps document that clearly—it's an important restriction that the library assumes English-language strings. ("no inferences on the language itself is made" isn't quite true since the language is inferred to be English, or to at least follow English-compatible rules for casing)
Thank you for the feedback!
I appreciate your suggestion regarding the name, but unfortunately this name was already taken, so "textcase" was chosen.
I also have ideas for adding dictionary key conversion and other features in the future that will handle more than just strings. In addition, you can use this library to convert cases of Iterable[str] using textcase.pattern
Looks brilliant!
My only suggestion is here:
> It also ignores any leading, trailing, or duplicate delimiters:
from textcase import case, convert
print(convert("IOStream", case.SNAKE)) # io_stream
print(convert("myJSONParser", case.SNAKE)) # my_json_parser
print(convert("__weird--var _name-", case.SNAKE)) # weird_var_name
In the case of a conversion target that has delimiters (snake, kebab), it might be nice to have an alternative option to preserve such features but normalise them to the target delimiter, i.e.
print(convert("__weird--var _name-", case.SNAKE, preserve=True)) # __weird__var__name
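The `preserve=True` flag above is the commenter's proposal, not an existing textcase parameter. As a rough sketch of the idea using only the standard library, the hypothetical `normalise_delimiters` below replaces every delimiter character one-for-one instead of collapsing runs. It does not split camelCase words, and unlike the sample output above it also keeps the trailing delimiter.

```python
import re

# Characters treated as delimiters in this sketch: whitespace, underscore, hyphen.
DELIMITER = re.compile(r"[\s_\-]")


def normalise_delimiters(text: str, target: str = "_") -> str:
    """Hypothetical helper: map each delimiter to `target`, keeping their count and position."""
    return DELIMITER.sub(target, text).lower()


print(normalise_delimiters("__weird--var _name-"))       # __weird__var__name_
print(normalise_delimiters("__weird--var _name-", "-"))  # --weird--var--name-
```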
This should be implemented in editors.
It also looks to be nice in exploratory data analysis:
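import pandas as pd
from textcase import case, convert
df = pd.read_csv(f)
df.columns = [convert(col, case.SNAKE) for col in df.columns]  # convert each column name to snake_case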
My favorite part of this library is that it seems to have zero dependencies!
Python packages seem to often rope in a surprising number of dependencies for relatively limited libraries.
I can easily imagine pulling this package into my work: thank you for keeping the requirements to a minimum!
Definitely something to be championed, although I suspect this is a matter of perspective. I find Python packages to have refreshingly few dependencies compared to packages in the JS ecosystem, although compared to the Swift ecosystem which I’m somewhat familiar with, they do tend to have a few more.
I appreciate your perspective! It's interesting to consider how the built-in libraries of a language can influence its ecosystem. Python does have a rich standard library that often reduces the need for external dependencies. In contrast, JavaScript's ecosystem has evolved around web development, where modularity and flexibility are prioritized, leading to a proliferation of packages.
Thanks for the suggestion!
Right now, there's no such GH badge. Since the project will always have zero dependencies, I think we can simply use a static badge like this:
HAppY ApRiL FoOLs!
If only this comment supported case conversion..
In any case congrats on shipping!
Now available from AUR: https://aur.archlinux.org/packages/python-textcase-git
I am really impressed by how thoroughly this library thinks through all the applications and edge cases of text casing.
On a recent project I spent about an hour trying to do something similar (and far less sophisticated) before I realized it was a problem I had no real desire to solve, so I backed out all my changes and just went with string.capitalize(), even though it didn’t really do what I was looking for. Looking forward to using this instead!