Learn ChatGPT

TODO: Continue reading docs from here ..


ChatGPT was released publicly on 30 November 2022.

Quick Links:

Courses I found on Coursera:

  • Prompt Engineering for ChatGPT ~ Vanderbilt University (USA, comparable to Cornell University): Click here
  • Generative AI with Large Language Models: Click here
  • Build AI Apps with ChatGPT, Dall-E, and GPT-4: Click here

Docs Links:

ChatGPT can now hear and speak - Official Docs

NOTE: Only for Plus and Enterprise users.

Click here

When I asked ChatGPT to generate 10,000-word and 2,000-word articles:

| Query | wordcounter.net | platform.openai.com/tokenizer |
| --- | --- | --- |
| 10,000-word article | 998 words, 6,438 characters | 1,285 tokens, 6,484 characters |
| 2,000-word article | 1,128 words, 7,292 characters | 1,491 tokens, 7,358 characters |

When I asked ChatGPT to count up to 2,500, 5,000 and 10,000:

  • It refuses to count to 5,000 and 10,000.
  • It counts up to 2,500, but I have to keep pressing "Continue generating" roughly every 500 numbers.
Date: 7 Sep, 2023

Quickstart tutorial - OpenAI end notes

Source: Click here

image

Completions

Correct:

image

Incorrect:

image

List of gpt-3.5-turbo models from the API - /models

gpt-3.5-turbo-16k-0613
gpt-3.5-turbo
gpt-3.5-turbo-16k
gpt-3.5-turbo-0613
gpt-3.5-turbo-0301
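
A minimal sketch of how such a list can be pulled from the /models endpoint, assuming the pre-1.0 `openai` Python package and an `OPENAI_API_KEY` environment variable (newer client versions use `client.models.list()` instead):

```python
# Sketch only: assumes the pre-1.0 `openai` package (pip install openai)
# and OPENAI_API_KEY set in the environment.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

# GET /models lists every model the account can access.
models = openai.Model.list()

# Keep only the gpt-3.5-turbo family, matching the list above.
turbo = sorted(m["id"] for m in models["data"] if m["id"].startswith("gpt-3.5-turbo"))
print("\n".join(turbo))
```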

General Terminologies

Source: Official Quickstart Guide from OpenAI: Click here

  • The completions endpoint is the core of our API and provides a simple interface that’s extremely flexible and powerful. You input some text as a prompt, and the API will return a text completion that attempts to match whatever instructions or context you gave it.
  • Designing your prompt is essentially how you “program” the model.
  • Prompt design isn’t the only tool you have at your disposal. You can also control completions by adjusting your settings. One of the most important settings is called temperature.
    • You may have noticed that if you submitted the same prompt multiple times in the examples above, the model would always return identical or very similar completions. This is because your temperature was set to 0.
    • Try re-submitting the same prompt a few times with temperature set to 1.
    • See what happened? When temperature is above 0, submitting the same prompt results in different completions each time.
    • Remember that the model predicts which text is most likely to follow the text preceding it. Temperature is a value between 0 and 1 that essentially lets you control how confident the model should be when making these predictions. Lowering temperature means it will take fewer risks, and completions will be more accurate and deterministic. Increasing temperature will result in more diverse completions.
    • For your pet name generator, you probably want to be able to generate a lot of name ideas. A moderate temperature of 0.6 should work well.
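
A minimal sketch of the temperature setting described in the list above, using the chat completions endpoint with the pre-1.0 `openai` Python package; the prompt and model name are only illustrative:

```python
# Sketch only: assumes the pre-1.0 `openai` package and OPENAI_API_KEY in the environment.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

prompt = "Suggest three names for a pet golden retriever."

for temperature in (0.0, 0.6, 1.0):
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,  # 0 = near-deterministic, higher = more diverse
    )
    print(f"temperature={temperature}: {response.choices[0].message.content}")
```

At temperature 0 repeated runs should return identical or near-identical names; at 1.0 the completions should diverge noticeably.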

Source of below image: Click here

image

DEEP DIVE - Understanding tokens and probabilities

Source: Official Quickstart Guide from OpenAI: Click here

image

Pricing - 1/2 - Most cost-effective model

image

Pricing - 2/2

Source Pricing: Click here

  • Cost of 1k input tokens + cost of 1k output tokens = ($0.0015 + $0.002) = $0.0035 (≈ Rs. 0.29)
  • Article from Open AI - What are tokens and how to count them? : Click here
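
A back-of-the-envelope sketch of the cost arithmetic above; the per-1k-token prices are the ones quoted, while the USD-to-INR rate is an assumption (roughly the September 2023 rate):

```python
# Sketch only: gpt-3.5-turbo (4K context) per-1k-token prices quoted above;
# the exchange rate (~83 INR per USD) is an assumption.
INPUT_PRICE_PER_1K = 0.0015   # USD per 1,000 prompt tokens
OUTPUT_PRICE_PER_1K = 0.002   # USD per 1,000 completion tokens
USD_TO_INR = 83

def cost_usd(input_tokens: int, output_tokens: int) -> float:
    return (input_tokens / 1000) * INPUT_PRICE_PER_1K + (output_tokens / 1000) * OUTPUT_PRICE_PER_1K

usd = cost_usd(1_000, 1_000)
print(f"${usd:.4f} (~Rs. {usd * USD_TO_INR:.2f})")  # $0.0035 (~Rs. 0.29)
```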

image

Image - 1/2 - Free Trial gives you $5 (Date: 5 September, 2023).

image

Image - 2/2 - Free Trial gives you $5 (Date: 5 September, 2023).

image

❤️ ❤️ ❤️ Personalized model training ❤️ ❤️ ❤️ :

image

Rate Limits

Source: Click here

image

Tokenizer

Source: Official Tokenizer Page from ChatGPT: platform.openai.com/tokenizer

The GPT family of models process text using tokens, which are common sequences of characters found in text. The models understand the statistical relationships between these tokens, and excel at producing the next token in a sequence of tokens.

You can use the tool below to understand how a piece of text would be tokenized by the API, and the total count of tokens in that piece of text.

A helpful rule of thumb is that one token generally corresponds to ~4 characters of text for common English text. This translates to roughly ¾ of a word (so 100 tokens ~= 75 words).

If you need a programmatic interface for tokenizing text, check out our tiktoken package for Python. For JavaScript, the gpt-3-encoder package for node.js works for most GPT-3 models.
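
A small sketch using the tiktoken package mentioned above to count tokens locally (pip install tiktoken); the sample sentence is only illustrative:

```python
# Sketch only: encoding_for_model picks the BPE encoding used by the given model.
import tiktoken

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

text = "A helpful rule of thumb is that one token is roughly four characters of English text."
tokens = enc.encode(text)

print(f"{len(text)} characters -> {len(tokens)} tokens")
print(tokens[:10])          # the first few integer token IDs
print(enc.decode(tokens))   # decoding round-trips back to the original text
```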

Compatible models for each endpoint

Source - Docs: Click here

image

Zero Retention

Source - Docs: Click here

  • To help identify abuse, API data may be retained for up to 30 days, after which it will be deleted (unless otherwise required by law). For trusted customers with sensitive applications, zero data retention may be available. With zero data retention, request and response bodies are not persisted to any logging mechanism and exist only in memory in order to serve the request.
  • Note that this data policy does not apply to OpenAI's non-API consumer services like ChatGPT or DALL·E Labs.

image