openai/tiktoken

Mirrored from https://github.com/openai/tiktoken (commit 05e66e8db7ef220d3c0b1aafbee5af289345684b)

CHANGELOG.md

# Changelog

This is the changelog for the open source version of tiktoken.

## [v0.8.0]

- Support for `o1-` and `chatgpt-4o-` models
- Build wheels for Python 3.13
- Add possessive quantifiers to limit backtracking in regular expressions, thanks to @l0rinc!
- Provide a better error message and type for invalid token decode
- Permit tuples in type hints
- Better error message for passing invalid input to `get_encoding`
- Better error messages during plugin loading
- Add a `__version__` attribute
- Update versions of `pyo3`, `regex`, `fancy-regex`
- Drop support for Python 3.8

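The possessive-quantifier entry for v0.8.0 can be illustrated with Python's standard `re` module, which accepts possessive quantifiers from Python 3.11 onward. This is only a sketch of the general technique, not tiktoken's actual patterns; the toy patterns and the helper `full_match` are hypothetical.

```python
import re
import sys

# Toy patterns (hypothetical, not taken from tiktoken).
GREEDY = r"a+a"       # a+ may give back characters so the trailing "a" can match
POSSESSIVE = r"a++a"  # a++ never gives back what it has consumed


def full_match(pattern: str, text: str) -> bool:
    """Return True if the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None


# The greedy form backtracks one step and succeeds.
print(full_match(GREEDY, "aaa"))

# Possessive quantifiers need Python 3.11+ in the standard `re` module.
if sys.version_info >= (3, 11):
    # a++ consumes every "a" and refuses to backtrack, so the match fails
    # immediately instead of retrying shorter prefixes.
    print(full_match(POSSESSIVE, "aaa"))
```

Because a possessive quantifier never retries shorter matches, a failing input is rejected without exploring alternative splits, which is where the reduction in backtracking comes from.
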
## [v0.7.0]

- Support for `gpt-4o`
- Performance improvements

## [v0.6.0]

- Optimise regular expressions for a 20% performance improvement, thanks to @paplorinc!
- Add `text-embedding-3-*` models to `encoding_for_model`
- Check content hash for downloaded files
- Allow pickling `Encoding` objects. Registered `Encoding` will be pickled by reference
- Workaround PyO3 bug for frozenset conversion

Thank you to @paplorinc, @mdwelsh, @Praneet460!

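The content hash check listed for v0.6.0 can be sketched with the standard library. The choice of SHA-256 and the helper names `check_hash` and `load_verified` are assumptions made for illustration; the changelog does not say which hash or function names tiktoken uses.

```python
import hashlib


def check_hash(data: bytes, expected_hash: str) -> bool:
    """Return True if the SHA-256 hex digest of `data` equals `expected_hash`."""
    actual_hash = hashlib.sha256(data).hexdigest()
    return actual_hash == expected_hash


def load_verified(data: bytes, expected_hash: str) -> bytes:
    """Return `data` unchanged if it passes the hash check, else raise ValueError."""
    if not check_hash(data, expected_hash):
        raise ValueError("Hash mismatch: the downloaded data may be corrupted")
    return data


# Stand-in for downloaded bytes (hypothetical payload).
payload = b"hello"
digest = hashlib.sha256(payload).hexdigest()
print(load_verified(payload, digest) == payload)
```

Verifying before use means a truncated or tampered download fails loudly rather than producing a silently wrong vocabulary.
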
## [v0.5.2]

- Build wheels for Python 3.12
- Update version of PyO3 to allow multiple imports
- Avoid permission errors when using default cache logic

## [v0.5.1]

- Add `encoding_name_for_model`, undo some renames to variables that are implementation details

## [v0.5.0]

- Add `tiktoken._educational` submodule to better document how byte pair encoding works
- Ensure `encoding_for_model` knows about several new models
- Add `decode_with_offsets`
- Better error for failures with the plugin mechanism
- Make more tests public
- Update versions of dependencies

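Since v0.5.0 adds a submodule for explaining byte pair encoding, a minimal sketch of the merge loop may help. This is a toy version under stated assumptions: the rank table `TOY_RANKS` and the function `bpe_encode` are hypothetical and are not the contents of `tiktoken._educational`.

```python
from __future__ import annotations

# Toy vocabulary (hypothetical): lower rank means the merge is applied earlier.
TOY_RANKS: dict[bytes, int] = {b"a": 0, b"b": 1, b"c": 2, b"ab": 3, b"abc": 4}


def bpe_encode(mergeable_ranks: dict[bytes, int], word: bytes) -> list[int]:
    """Greedily merge the adjacent pair with the lowest rank until none is mergeable."""
    parts = [bytes([b]) for b in word]  # start from single bytes
    while True:
        min_idx = None
        min_rank = None
        # Find the leftmost adjacent pair whose concatenation has the lowest rank.
        for i in range(len(parts) - 1):
            rank = mergeable_ranks.get(parts[i] + parts[i + 1])
            if rank is not None and (min_rank is None or rank < min_rank):
                min_idx, min_rank = i, rank
        if min_idx is None:
            break  # no adjacent pair is in the vocabulary
        merged = parts[min_idx] + parts[min_idx + 1]
        parts = parts[:min_idx] + [merged] + parts[min_idx + 2:]
    # Every remaining part must be a vocabulary entry.
    return [mergeable_ranks[part] for part in parts]


print(bpe_encode(TOY_RANKS, b"abcab"))  # prints [4, 3]
```

Here `b"abcab"` first merges both `ab` pairs (rank 3), then `ab` + `c` into `abc` (rank 4), leaving the tokens for `abc` and `ab`.
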
## [v0.4.0]

- Add `decode_batch` and `decode_bytes_batch`
- Improve error messages and handling

## [v0.3.3]

- `tiktoken` will now make a best-effort attempt to replace surrogate pairs with the corresponding
  Unicode character and will replace lone surrogates with the Unicode replacement character.

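The surrogate handling described for v0.3.3 can be sketched in pure Python with a UTF-16 round trip. The helper name `repair_surrogates` is hypothetical, and this shows one way to get the described behaviour rather than a claim about how tiktoken does it internally.

```python
def repair_surrogates(text: str) -> str:
    """Join surrogate pairs into one character; replace lone surrogates with U+FFFD."""
    # "surrogatepass" lets surrogate code points through as raw UTF-16 units.
    raw = text.encode("utf-16", "surrogatepass")
    # Decoding joins valid high/low pairs; "replace" turns unpaired ones into U+FFFD.
    return raw.decode("utf-16", "replace")


paired = "\ud83d\udc4d"  # high + low surrogate for U+1F44D
lone = "\ud83dx"         # high surrogate with no partner, followed by "x"

print(repair_surrogates(paired) == "\U0001F44D")
print(repair_surrogates(lone) == "\ufffdx")
```

Text without surrogates passes through unchanged, so the repair is safe to apply only when a normal UTF-8 encode fails.
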
## [v0.3.2]

- Add encoding for GPT-4

## [v0.3.1]

- Build aarch64 wheels
- Make `blobfile` an optional dependency

Thank you to @messense for the environment variable that makes cargo not OOM under emulation!

## [v0.3.0]

- Improve performance by 5-20%; thank you to @nistath!
- Add `gpt-3.5-turbo` models to `encoding_for_model`
- Add prefix matching to `encoding_for_model` to better support future model versions
- Fix a bug in the README instructions on extending tiktoken
- Update the set of available encodings
- Add packaging metadata

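The prefix matching added to `encoding_for_model` in v0.3.0 can be sketched as an exact lookup followed by a prefix scan. The two tables and the function `lookup_encoding_name` below are hypothetical and deliberately tiny; they only illustrate why a dated model name can resolve without a new release.

```python
# Illustrative tables only (assumed for this sketch); the real mappings are larger.
MODEL_TO_ENCODING = {"gpt-3.5-turbo": "cl100k_base"}
MODEL_PREFIX_TO_ENCODING = {"gpt-3.5-turbo-": "cl100k_base"}


def lookup_encoding_name(model_name: str) -> str:
    """Resolve a model name to an encoding name, falling back to prefix matching."""
    # Exact names win over prefixes.
    if model_name in MODEL_TO_ENCODING:
        return MODEL_TO_ENCODING[model_name]
    # Otherwise accept any name that starts with a known prefix,
    # e.g. a dated snapshot of a known model family.
    for prefix, encoding_name in MODEL_PREFIX_TO_ENCODING.items():
        if model_name.startswith(prefix):
            return encoding_name
    raise KeyError(f"Could not map {model_name!r} to an encoding")


print(lookup_encoding_name("gpt-3.5-turbo"))
print(lookup_encoding_name("gpt-3.5-turbo-0301"))
```

Keeping the trailing hyphen in the prefix avoids accidentally matching unrelated names that merely share the same leading characters.
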
## [v0.2.0]

- Add `tiktoken.encoding_for_model` to get the encoding for a specific model
- Improve portability of caching logic

Thank you to @fritzo, @arvid220u, @khanhvu207, @henriktorget for various small corrections

## [v0.1.2]

- Avoid use of `blobfile` for public files
- Add support for Python 3.8
- Add py.typed
- Improve the public tests

## [v0.1.1]

- Initial release
