openai/openai-python

Public

mirrored fromhttps://github.com/openai/openai-pythonAvailable

Watch0 Fork0 Star0

Code Commits Issues Pull requests Actions Insights Security

v0.27.8

Find a branch or tag

Branches

v0.27.8

Clone

HTTPS

Download ZIP

chatml.md

93lines · modecode

Raw Download

Latest commit unavailable.

unknown

1	`(This document is a preview of the underlying format consumed by`
2	`ChatGPT models. As a developer, you can use our [higher-level`
3	`API](https://platform.openai.com/docs/guides/chat) and won't need to`
4	`interact directly with this format today — but expect to have the`
5	`option in the future!)`
6
7	`Traditionally, GPT models consumed unstructured text. ChatGPT models`
8	`instead expect a structured format, called Chat Markup Language`
9	`(ChatML for short).`
10	`ChatML documents consist of a sequence of messages. Each message`
11	`contains a header (which today consists of who said it, but in the`
12	`future will contain other metadata) and contents (which today is a`
13	`text payload, but in the future will contain other datatypes).`
14	`We are still evolving ChatML, but the current version (ChatML v0) can`
15	`be represented with our upcoming "list of dicts" JSON format as`
16	`follows:`
17	```
18	`[`
19	`{"token": "<\|im_start\|>"},`
20	`"system\nYou are ChatGPT, a large language model trained by OpenAI. Answer as concisely as possible.\nKnowledge cutoff: 2021-09-01\nCurrent date: 2023-03-01",`
21	`{"token": "<\|im_end\|>"}, "\n", {"token": "<\|im_start\|>"},`
22	`"user\nHow are you",`
23	`{"token": "<\|im_end\|>"}, "\n", {"token": "<\|im_start\|>"},`
24	`"assistant\nI am doing well!",`
25	`{"token": "<\|im_end\|>"}, "\n", {"token": "<\|im_start\|>"},`
26	`"user\nHow are you now?",`
27	`{"token": "<\|im_end\|>"}, "\n"`
28	`]`
29	```
30	`You could also represent it in the classic "unsafe raw string"`
31	`format. However, this format inherently allows injections from user`
32	`input containing special-token syntax, similar to SQL injections:`
33	```
34	`<\|im_start\|>system`
35	`You are ChatGPT, a large language model trained by OpenAI. Answer as concisely as possible.`
36	`Knowledge cutoff: 2021-09-01`
37	`Current date: 2023-03-01<\|im_end\|>`
38	`<\|im_start\|>user`
39	`How are you<\|im_end\|>`
40	`<\|im_start\|>assistant`
41	`I am doing well!<\|im_end\|>`
42	`<\|im_start\|>user`
43	`How are you now?<\|im_end\|>`
44	```
45	`## Non-chat use-cases`
46	`ChatML can be applied to classic GPT use-cases that are not`
47	`traditionally thought of as chat. For example, instruction following`
48	`(where a user requests for the AI to complete an instruction) can be`
49	`implemented as a ChatML query like the following:`
50	```
51	`[`
52	`{"token": "<\|im_start\|>"},`
53	`"user\nList off some good ideas:",`
54	`{"token": "<\|im_end\|>"}, "\n", {"token": "<\|im_start\|>"},`
55	`"assistant"`
56	`]`
57	```
58	`We do not currently allow autocompleting of partial messages,`
59	```
60	`[`
61	`{"token": "<\|im_start\|>"},`
62	`"system\nPlease autocomplete the user's message.",`
63	`{"token": "<\|im_end\|>"}, "\n", {"token": "<\|im_start\|>"},`
64	`"user\nThis morning I decided to eat a giant"`
65	`]`
66	```
67	`Note that ChatML makes explicit to the model the source of each piece`
68	`of text, and particularly shows the boundary between human and AI`
69	`text. This gives an opportunity to mitigate and eventually solve`
70	`injections, as the model can tell which instructions come from the`
71	`developer, the user, or its own input.`
72	`## Few-shot prompting`
73	`In general, we recommend adding few-shot examples using separate`
74	`system` messages with a `name` field of `example_user` or
75	`example_assistant`. For example, here is a 1-shot prompt:
76	```
77	`<\|im_start\|>system`
78	`Translate from English to French`
79	`<\|im_end\|>`
80	`<\|im_start\|>system name=example_user`
81	`How are you?`
82	`<\|im_end\|>`
83	`<\|im_start\|>system name=example_assistant`
84	`Comment allez-vous?`
85	`<\|im_end\|>`
86	`<\|im_start\|>user`
87	`{{user input here}}<\|im_end\|>`
88	```
89	If adding instructions in the `system` message doesn't work, you can
90	also try putting them into a `user` message. (In the near future, we
91	`will train our models to be much more steerable via the system`
92	`message. But to date, we have trained only on a few system messages,`
93	`so the models pay much more attention to user examples.)`
94

openai/openai-python

Branches

Tags

Clone