openai/openai-python

Public

mirrored fromhttps://github.com/openai/openai-pythonAvailable

Watch0 Fork0 Star0

Code Commits Issues Pull requests Actions Insights Security

v0.28.1

Find a branch or tag

Branches

v0.28.1

Clone

HTTPS

Download ZIP

chatml.md

96lines · modecode

Raw Download

Latest commit unavailable.

unknown

1	`> [!IMPORTANT]`
2	`> This page is not currently maintained and is intended to provide general insight into the ChatML format, not current up-to-date information.`
3
4	`(This document is a preview of the underlying format consumed by`
5	`GPT models. As a developer, you can use our [higher-level`
6	`API](https://platform.openai.com/docs/guides/chat) and won't need to`
7	`interact directly with this format today — but expect to have the`
8	`option in the future!)`
9
10	`Traditionally, GPT models consumed unstructured text. ChatGPT models`
11	`instead expect a structured format, called Chat Markup Language`
12	`(ChatML for short).`
13	`ChatML documents consist of a sequence of messages. Each message`
14	`contains a header (which today consists of who said it, but in the`
15	`future will contain other metadata) and contents (which today is a`
16	`text payload, but in the future will contain other datatypes).`
17	`We are still evolving ChatML, but the current version (ChatML v0) can`
18	`be represented with our upcoming "list of dicts" JSON format as`
19	`follows:`
20	```
21	`[`
22	`{"token": "<\|im_start\|>"},`
23	`"system\nYou are ChatGPT, a large language model trained by OpenAI. Answer as concisely as possible.\nKnowledge cutoff: 2021-09-01\nCurrent date: 2023-03-01",`
24	`{"token": "<\|im_end\|>"}, "\n", {"token": "<\|im_start\|>"},`
25	`"user\nHow are you",`
26	`{"token": "<\|im_end\|>"}, "\n", {"token": "<\|im_start\|>"},`
27	`"assistant\nI am doing well!",`
28	`{"token": "<\|im_end\|>"}, "\n", {"token": "<\|im_start\|>"},`
29	`"user\nHow are you now?",`
30	`{"token": "<\|im_end\|>"}, "\n"`
31	`]`
32	```
33	`You could also represent it in the classic "unsafe raw string"`
34	`format. However, this format inherently allows injections from user`
35	`input containing special-token syntax, similar to SQL injections:`
36	```
37	`<\|im_start\|>system`
38	`You are ChatGPT, a large language model trained by OpenAI. Answer as concisely as possible.`
39	`Knowledge cutoff: 2021-09-01`
40	`Current date: 2023-03-01<\|im_end\|>`
41	`<\|im_start\|>user`
42	`How are you<\|im_end\|>`
43	`<\|im_start\|>assistant`
44	`I am doing well!<\|im_end\|>`
45	`<\|im_start\|>user`
46	`How are you now?<\|im_end\|>`
47	```
48	`## Non-chat use-cases`
49	`ChatML can be applied to classic GPT use-cases that are not`
50	`traditionally thought of as chat. For example, instruction following`
51	`(where a user requests for the AI to complete an instruction) can be`
52	`implemented as a ChatML query like the following:`
53	```
54	`[`
55	`{"token": "<\|im_start\|>"},`
56	`"user\nList off some good ideas:",`
57	`{"token": "<\|im_end\|>"}, "\n", {"token": "<\|im_start\|>"},`
58	`"assistant"`
59	`]`
60	```
61	`We do not currently allow autocompleting of partial messages,`
62	```
63	`[`
64	`{"token": "<\|im_start\|>"},`
65	`"system\nPlease autocomplete the user's message.",`
66	`{"token": "<\|im_end\|>"}, "\n", {"token": "<\|im_start\|>"},`
67	`"user\nThis morning I decided to eat a giant"`
68	`]`
69	```
70	`Note that ChatML makes explicit to the model the source of each piece`
71	`of text, and particularly shows the boundary between human and AI`
72	`text. This gives an opportunity to mitigate and eventually solve`
73	`injections, as the model can tell which instructions come from the`
74	`developer, the user, or its own input.`
75	`## Few-shot prompting`
76	`In general, we recommend adding few-shot examples using separate`
77	`system` messages with a `name` field of `example_user` or
78	`example_assistant`. For example, here is a 1-shot prompt:
79	```
80	`<\|im_start\|>system`
81	`Translate from English to French`
82	`<\|im_end\|>`
83	`<\|im_start\|>system name=example_user`
84	`How are you?`
85	`<\|im_end\|>`
86	`<\|im_start\|>system name=example_assistant`
87	`Comment allez-vous?`
88	`<\|im_end\|>`
89	`<\|im_start\|>user`
90	`{{user input here}}<\|im_end\|>`
91	```
92	If adding instructions in the `system` message doesn't work, you can
93	also try putting them into a `user` message. (In the near future, we
94	`will train our models to be much more steerable via the system`
95	`message. But to date, we have trained only on a few system messages,`
96	`so the models pay much more attention to user examples.)`
97

openai/openai-python

Branches

Tags

Clone