ENH: support bytearray and unicode (#3) #4

Open
westurner wants to merge 4 commits into kosqx:master from westurner:feature/bytearray_and_unicode
@westurner westurner commented Nov 13, 2016

@notpushkin

Fails on Python 3, NameError: name 'unicode' is not defined :(

@westurner

Author

Whoops. I think the unicode check should only apply on Python 2, because in
Python 3 every str is already unicode.

On Wednesday, November 16, 2016, Alexander Pushkov notifications@github.com wrote:

> Fails on Python 3, NameError: name 'unicode' is not defined :(


@notpushkin

The error actually happens in the `elif t is unicode` line. On Python 3 we can safely replace `unicode` with `str` here (as `unicode_to_bytes` accepts `str` in Python 3), so I suggest this simple fix: Songbee@0bb114b
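
For illustration only, one common way to make such a type check work on both Python 2 and Python 3 is a version-dependent alias. This is a minimal sketch and not necessarily what the linked commit does: `text_type` and `to_bytes` are hypothetical names, and the `unicode_to_bytes` body here is a stand-in, since its real implementation is not shown in this thread.

```python
import sys

# Hypothetical alias: `unicode` only exists on Python 2, so it is only
# evaluated there. On Python 3 every str is already a text string.
if sys.version_info[0] >= 3:
    text_type = str
else:
    text_type = unicode  # noqa: F821 (Python 2 only)


def unicode_to_bytes(value, encoding="utf-8"):
    # Stand-in body (assumption): encode text to bytes as UTF-8.
    return value.encode(encoding)


def to_bytes(value):
    """Hypothetical helper mirroring the `elif t is unicode` dispatch."""
    t = type(value)
    if t is bytes:
        return value
    elif t is bytearray:
        return bytes(value)
    elif t is text_type:
        return unicode_to_bytes(value)
    raise TypeError("unsupported type: %r" % (t,))
```

With the alias in place, the name `unicode` is never looked up on Python 3, which is what avoids the `NameError`.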

@westurner

Author

Good call @iamale

@westurner

Author
  • TST: What should I add to TEST_DATA to test for unicode strings?
  • ENH,TST: I don't know the Python C API well enough to port this from _pure.py to _fast.c

@westurner

Author

... TIL "All character string values are UTF-8 encoded." https://wiki.theory.org/BitTorrentSpecification#Metainfo_File_Structure
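
A consequence of UTF-8 encoding worth keeping in mind for test data: the bencode length prefix counts bytes, not characters. The sketch below shows this with a hypothetical `encode_string` helper. It is an illustration under that assumption, not the code in `_pure.py`.

```python
def encode_string(value, encoding="utf-8"):
    """Bencode one string value (hypothetical helper for illustration)."""
    if isinstance(value, bytearray):
        data = bytes(value)
    elif isinstance(value, bytes):
        data = value
    else:
        # Text is encoded first, so the length below is the byte length.
        data = value.encode(encoding)
    return str(len(data)).encode("ascii") + b":" + data


print(encode_string(u"spam"))      # ASCII text: 4 characters, 4 bytes
print(encode_string(u"h\u00e9"))   # 2 characters, but 3 bytes in UTF-8
```

So a unicode entry for TEST_DATA with non-ASCII characters would need a prefix matching the encoded byte count, e.g. `b'3:h\xc3\xa9'` for `u'h\u00e9'`.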

Comment thread tests/test_bencode.py
(b'd3:bar4:spam3:fooi42ee', {b'bar': b'spam', b'foo': 42}),
(b'd1:ai1e1:bi2e1:ci3ee', {b'a': 1, b'b': 2, b'c': 3}),
(b'd1:a1:be', {b'a': b'b'}),
(b'3:\x00\x01\x02', bytearray([0, 1, 2])),

@westurner westurner Nov 17, 2016

Author

is this correct?
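
One way to reason about this test case, as a sketch: `bytes` and `bytearray` compare equal when their contents match, so an equality-based round trip can pass even if the decoder returns `bytes` rather than `bytearray`. The parsing below is a stand-in for a decoder and an assumption about how the test compares values. It is not the library's code.

```python
expected = bytearray([0, 1, 2])
encoded = b"3:\x00\x01\x02"

# Stand-in for a decoder: split off the length prefix and slice the payload.
length, _, payload = encoded.partition(b":")
decoded = payload[: int(length.decode("ascii"))]

# Equality holds across bytes/bytearray, but the types still differ.
assert decoded == expected
assert type(decoded) is not type(expected)
```

If the test should also pin down the returned type, it would need an explicit type check in addition to `==`.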

@notpushkin


Also I think it would be nice to decode unicode strings to unicode/str as well (although it might be a bit trickier).
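
To illustrate why this is trickier, here is a minimal sketch of a decode-with-fallback approach. `decode_string` is a hypothetical name, and the fallback policy is an assumption rather than anything proposed in this PR.

```python
def decode_string(data, encoding="utf-8"):
    """Return text when the bytes are valid UTF-8, else the raw bytes."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return data


print(repr(decode_string(b"spam")))          # valid UTF-8: decoded to text
print(repr(decode_string(b"\xff\xfe")))      # invalid UTF-8: left as bytes
print(repr(decode_string(b"\x00\x01\x02")))  # binary, yet valid UTF-8: decoded
```

The last case is the catch: binary payloads (such as the SHA-1 hashes in a torrent's `pieces` field) can happen to be valid UTF-8, so a content-based guess may turn them into text. A decoder would probably need an explicit option or per-key rules instead.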
