openai/openai-python

Public

mirrored from https://github.com/openai/openai-pythonAvailable

Watch0 Fork0 Star0

Code Commits Issues Pull requests Actions Insights Security

logankilpatrick-patch-1

Find a branch or tag

Branches

logankilpatrick-patch-1

Clone

HTTPS

Download ZIP

chatml.md

98lines · modecode

Raw Download

Latest commit unavailable.

unknown

1	`(This document is a preview of the underlying format consumed by`
2	`ChatGPT models. As a developer, you can use our [higher-level`
3	`API](https://platform.openai.com/docs/guides/chat) and won't need to`
4	`interact directly with this format today — but expect to have the`
5	`option in the future!)`
6
7	`Note: This document showcases the underlying format consumed by`
8	`ChatGPT models when GPT-3.5-Turbo was released. It may not be up to date with`
9	`our current models and should not be relied on for correctness but rather used`
10	`to build a mental model of what is happening behind the scenes.`
11
12	`Traditionally, GPT models consumed unstructured text. ChatGPT models`
13	`instead expect a structured format, called Chat Markup Language`
14	`(ChatML for short).`
15	`ChatML documents consist of a sequence of messages. Each message`
16	`contains a header (which today consists of who said it, but in the`
17	`future will contain other metadata) and contents (which today is a`
18	`text payload, but in the future will contain other datatypes).`
19	`We are still evolving ChatML, but the current version (ChatML v0) can`
20	`be represented with our upcoming "list of dicts" JSON format as`
21	`follows:`
22	```
23	`[`
24	`{"token": "<\|im_start\|>"},`
25	`"system\nYou are ChatGPT, a large language model trained by OpenAI. Answer as concisely as possible.\nKnowledge cutoff: 2021-09-01\nCurrent date: 2023-03-01",`
26	`{"token": "<\|im_end\|>"}, "\n", {"token": "<\|im_start\|>"},`
27	`"user\nHow are you",`
28	`{"token": "<\|im_end\|>"}, "\n", {"token": "<\|im_start\|>"},`
29	`"assistant\nI am doing well!",`
30	`{"token": "<\|im_end\|>"}, "\n", {"token": "<\|im_start\|>"},`
31	`"user\nHow are you now?",`
32	`{"token": "<\|im_end\|>"}, "\n"`
33	`]`
34	```
35	`You could also represent it in the classic "unsafe raw string"`
36	`format. Note this format inherently allows injections from user input`
37	`containing special-token syntax, similar to a SQL injections:`
38	```
39	`<\|im_start\|>system`
40	`You are ChatGPT, a large language model trained by OpenAI. Answer as concisely as possible.`
41	`Knowledge cutoff: 2021-09-01`
42	`Current date: 2023-03-01<\|im_end\|>`
43	`<\|im_start\|>user`
44	`How are you<\|im_end\|>`
45	`<\|im_start\|>assistant`
46	`I am doing well!<\|im_end\|>`
47	`<\|im_start\|>user`
48	`How are you now?<\|im_end\|>`
49	```
50	`## Non-chat use-cases`
51	`ChatML can be applied to classic GPT use-cases that are not`
52	`traditionally thought of as chat. For example, instruction following`
53	`(where a user requests for the AI to complete an instruction) can be`
54	`implemented as a ChatML query like the following:`
55	```
56	`[`
57	`{"token": "<\|im_start\|>"},`
58	`"user\nList off some good ideas:",`
59	`{"token": "<\|im_end\|>"}, "\n", {"token": "<\|im_start\|>"},`
60	`"assistant"`
61	`]`
62	```
63	`We do not currently allow autocompleting of partial messages,`
64	```
65	`[`
66	`{"token": "<\|im_start\|>"},`
67	`"system\nPlease autocomplete the user's message."`
68	`{"token": "<\|im_end\|>"}, "\n", {"token": "<\|im_start\|>"},`
69	`"user\nThis morning I decided to eat a giant"`
70	`]`
71	```
72	`Note that ChatML makes explicit to the model the source of each piece`
73	`of text, and particularly shows the boundary between human and AI`
74	`text. This gives an opportunity to mitigate and eventually solve`
75	`injections, as the model can tell which instructions come from the`
76	`developer, the user, or its own input.`
77	`## Few-shot prompting`
78	`In general, we recommend adding few-shot examples using separate`
79	`system` messages with a `name` field of `example_user` or
80	`example_assistant`. For example, here is a 1-shot prompt:
81	```
82	`<\|im_start\|>system`
83	`Translate from English to French`
84	`<\|im_end\|>`
85	`<\|im_start\|>system name=example_user`
86	`How are you?`
87	`<\|im_end\|>`
88	`<\|im_start\|>system name=example_assistant`
89	`Comment allez-vous?`
90	`<\|im_end\|>`
91	`<\|im_start\|>user`
92	`{{user input here}}<\|im_end\|>`
93	```
94	If adding instructions in the `system` message doesn't work, you can
95	also try putting them into a `user` message. (In the near future, we
96	`will train our models to be much more steerable via the system`
97	`message. But to date, we have trained only on a few system messages,`
98	`so the models pay much more attention to user examples.)`
99

openai/openai-python

Branches

Tags

Clone