- Overview
- What are embeddings
- Technologies
- Installation
- Usage
- Pre-seeded Data and Sample Queries
- Debugging on Visual Studio
- OpenAI API Call Samples
- OpenAI API Pricing
The room-mapping-ai is a proof-of-concept (POC) web API that lets users find the best possible matches for hotel room names.
The key ingredient of this application is the combination of the OpenAI embedding SDK and a PostgreSQL database with the vector extension (pgvector).
When a new hotel room name is provided, the API retrieves the embedding vector from the OpenAI API and stores it in the PostgreSQL database.
It then uses the native search functionality in PostgreSQL to find potential matches based on vector distance.
Both the web API and the PostgreSQL database run in containers, so there is no need to install anything apart from Docker.
The model used to compute the embeddings on the OpenAI side is:
AdaTextEmbedding
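The store-then-search flow described above can be sketched in a few lines of Python. This is only an in-memory illustration, not the project's implementation: `RoomIndex`, `TOY_VECTORS`, and the injected `embed` callable are hypothetical names, and the three-dimensional vectors stand in for the real embeddings returned by the OpenAI API.

```python
import math
from typing import Callable, Dict, List, Tuple

Vector = List[float]


def cosine_distance(a: Vector, b: Vector) -> float:
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


class RoomIndex:
    """In-memory stand-in for the rooms table (hypothetical, for illustration)."""

    def __init__(self, embed: Callable[[str], Vector]) -> None:
        self._embed = embed  # stand-in for the call to the embeddings API
        self._rooms: Dict[str, Vector] = {}

    def add(self, room_name: str) -> None:
        # Fetch the embedding and store it next to the room name.
        self._rooms[room_name] = self._embed(room_name)

    def most_similar(self, room_name: str, limit: int = 5) -> List[Tuple[str, float]]:
        # Rank the stored rooms by distance to the query embedding.
        query = self._embed(room_name)
        scored = [(name, cosine_distance(query, vec)) for name, vec in self._rooms.items()]
        return sorted(scored, key=lambda item: item[1])[:limit]


# Toy 3-dimensional "embeddings"; real embeddings come from the API.
TOY_VECTORS = {
    "Serenity Suite": [0.9, 0.1, 0.0],
    "Serenity Luxury": [0.8, 0.2, 0.1],
    "Ocean View Deluxe": [0.1, 0.9, 0.2],
}

index = RoomIndex(embed=TOY_VECTORS.__getitem__)
index.add("Serenity Suite")
index.add("Ocean View Deluxe")
matches = index.most_similar("Serenity Luxury")
print(matches[0][0])  # name of the closest stored room
```

In the real application the dictionary is replaced by a PostgreSQL table and the sort by an `ORDER BY` on the distance operator.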
OpenAI’s text embeddings measure the relatedness of text strings. Embeddings are commonly used for:
Search (where results are ranked by relevance to a query string)
An embedding is a vector (list) of floating point numbers. The distance between two vectors measures their relatedness.
Small distances suggest high relatedness and large distances suggest low relatedness.
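As a concrete illustration, the cosine distance (the metric behind the `<=>` operator used in the sample query in this README) between two embedding vectors can be written as follows. The symbols a and b are placeholders for any two embeddings.

```latex
d_{\cos}(\mathbf{a}, \mathbf{b}) = 1 - \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert}
```

Two vectors pointing in the same direction give a distance of 0, and two orthogonal vectors give a distance of 1.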
- Docker
- Clone the repository: https://github.com/brunonuzzi/room-mapping-ai
- Create your OpenAI API key: https://platform.openai.com/account/api-keys
- Navigate to the project directory:
cd room-mapping-ai
- In room-mapping-ai/appsettings.json, replace the ApiKey value with your own OpenAI API key:
"OpenAI": {
  "ApiKey": "<your-openai-api-key>"
}
- Build the Docker images with Docker Compose:
docker-compose build
- Start the containers:
docker-compose up
- Open http://localhost:5000/swagger/index.html in your browser.
To stop and remove containers:
docker-compose down --remove-orphans
To use the room-mapping-ai API, send a POST request with the hotel room name in the roomName query parameter:
curl -X 'POST' \
'http://localhost:5000/api/RoomMapping/GetMostSimilarRooms?roomName=Serenity%20Luxury' \
-H 'accept: text/plain' \
-d ''
The API will respond with a list of potential matches:
[
  {
    "id": 1,
    "hotelName": "Mallorca Rocks",
    "roomName": "Serenity Suite",
    "vectorDistance": 0.06337405572419863
  },
  {
    "id": 10,
    "hotelName": "Mallorca Rocks",
    "roomName": "Serenity Double Room",
    "vectorDistance": 0.08145154924929454
  },
  {
    "id": 5,
    "hotelName": "Mallorca Rocks",
    "roomName": "Royal Executive Suite",
    "vectorDistance": 0.15970261195384094
  },
  {
    "id": 4,
    "hotelName": "Mallorca Rocks",
    "roomName": "Urban Escape Penthouse",
    "vectorDistance": 0.16005893649310143
  },
  {
    "id": 2,
    "hotelName": "Mallorca Rocks",
    "roomName": "Ocean View Deluxe",
    "vectorDistance": 0.16867594238009598
  }
]
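The same call can be made from a script. The sketch below uses only the Python standard library and mirrors the curl command: the room name goes in the query string and the body is empty. The function names are hypothetical, and the base URL assumes the containers are running locally on port 5000 as in the Swagger URL.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:5000"  # assumed from the Swagger URL in this README


def build_request(room_name: str) -> urllib.request.Request:
    """Build the same POST request as the curl command (query parameter, empty body)."""
    query = urllib.parse.urlencode({"roomName": room_name}, quote_via=urllib.parse.quote)
    url = f"{BASE_URL}/api/RoomMapping/GetMostSimilarRooms?{query}"
    return urllib.request.Request(
        url, data=b"", method="POST", headers={"accept": "text/plain"}
    )


def get_most_similar_rooms(room_name: str) -> list:
    """Send the request and decode the JSON list of matches (containers must be running)."""
    with urllib.request.urlopen(build_request(room_name)) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `get_most_similar_rooms("Serenity Luxury")` should return a list of dictionaries shaped like the response shown above, each with a `roomName` and a `vectorDistance`.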
Open your preferred PostgreSQL client (psql, pgAdmin, or DBeaver) and create a new connection:
- Host: localhost
- Port: 5432
- Username: roommapping-user
- Password: roommapping-password
- Database name: room-mapping
The room-mapping-ai comes with pre-seeded data to help you get started quickly. This data includes a collection of hotel room names with their respective embedding vectors. To explore the pre-seeded data and better understand the application's capabilities, you can execute sample queries directly on the PostgreSQL database.
For instance, you can use the following query to find the top 5 most similar hotel rooms to the room id = 1 (Serenity Suite), based on their embedding vectors:
SELECT Id
,HotelName
,RoomName
,CreatedDate
,'Serenity Suite' SearchedRoomName
,embedding <=> (select embedding from rooms where id = 1)
as vector_distance
FROM public.Rooms
ORDER BY vector_distance limit 5
The <=> operator in the previous query computes the cosine distance; other distance operators are listed in the pgvector documentation.
The query returns the following results:
| ID | Hotel Name | Room Name | CreatedDate | SearchedRoomName | vector_distance |
|---|---|---|---|---|---|
| 1 | Mallorca Rocks | Serenity Suite | 2023-03-28 | Serenity Suite | 0 |
| 13 | Mallorca Rocks | Serenity Luxury | 2023-03-28 | Serenity Suite | 0.06337405572419863 |
| 10 | Mallorca Rocks | Serenity Double Room | 2023-03-28 | Serenity Suite | 0.06572621072063112 |
| 5 | Mallorca Rocks | Royal Executive Suite | 2023-03-28 | Serenity Suite | 0.12737531341984398 |
| 3 | Mallorca Rocks | Garden Terrace Room | 2023-03-28 | Serenity Suite | 0.1471648253749488 |
For a better understanding of how the seeding is done, you can check the seed.sql file.
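To see how the cosine distance operator used in the sample query differs from the alternatives, the sketch below reimplements three distance measures in plain Python. It assumes the vector extension is pgvector, where `<->` is Euclidean (L2) distance, `<#>` is the negative inner product, and `<=>` is cosine distance; the function names are illustrative only.

```python
import math
from typing import List

Vector = List[float]


def l2_distance(a: Vector, b: Vector) -> float:
    """Euclidean distance, the measure behind `a <-> b`."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def negative_inner_product(a: Vector, b: Vector) -> float:
    """Negated dot product, the measure behind `a <#> b`."""
    return -sum(x * y for x, y in zip(a, b))


def cosine_distance(a: Vector, b: Vector) -> float:
    """1 - cosine similarity, the measure behind `a <=> b`."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Two vectors with the same direction but different lengths.
a = [3.0, 4.0]
b = [6.0, 8.0]

print(l2_distance(a, b))             # non-zero: the lengths differ
print(negative_inner_product(a, b))  # more negative means more similar
print(cosine_distance(a, b))         # zero: the directions are identical
```

Cosine distance ignores vector length and compares direction only, which is why a room compared with itself scores 0 in the table above.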
To debug the web API in Visual Studio, first remove all existing containers related to the web API to avoid port conflicts.
Once the containers are removed, open room-mapping-ai.sln in Visual Studio and launch docker compose.
The web API should start with the debugger attached.
The following snippet calls the OpenAI API directly to generate an embedding vector for a hotel room name:
curl https://api.openai.com/v1/embeddings \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-d '{
"input": "Ocean View Deluxe",
"model": "text-embedding-ada-002"
}'
The expected result:
{
  "object": "list",
  "data": [
    {
      "object": "embedding",
      "index": 0,
      "embedding": [
        0.0041769226,
        0.0067320974,
        ...
      ]
    }
  ],
  "model": "text-embedding-ada-002-v2",
  "usage": {
    "prompt_tokens": 3,
    "total_tokens": 3
  }
}
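A response of this shape can be unpacked with the standard `json` module. The sketch below is illustrative: `extract_embedding` is a hypothetical helper, and `SAMPLE_RESPONSE` is a truncated copy of the response above that keeps only the two vector values shown, whereas a real vector is far longer.

```python
import json
from typing import List, Tuple

# Truncated sample: only the two values printed in the README are kept.
SAMPLE_RESPONSE = """
{
  "object": "list",
  "data": [
    {"object": "embedding", "index": 0, "embedding": [0.0041769226, 0.0067320974]}
  ],
  "model": "text-embedding-ada-002-v2",
  "usage": {"prompt_tokens": 3, "total_tokens": 3}
}
"""


def extract_embedding(response_text: str) -> Tuple[List[float], int]:
    """Return the first embedding vector and the total token count from a response body."""
    payload = json.loads(response_text)
    vector = payload["data"][0]["embedding"]
    total_tokens = payload["usage"]["total_tokens"]
    return vector, total_tokens


vector, total_tokens = extract_embedding(SAMPLE_RESPONSE)
print(len(vector), total_tokens)
```

The `usage.total_tokens` field is the number of tokens billed for the request.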
The room-mapping-ai API uses the OpenAI API to generate embedding vectors. At the time of writing, the price is $0.0004 per 1K tokens.
An average string like "Serenity Double Room" has 3 tokens. Current prices are listed on the OpenAI pricing page.
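As a rough worked example of the pricing above, the sketch below estimates the cost of embedding room names. It assumes the quoted price of $0.0004 per 1K tokens still applies and that each name averages 3 tokens; `embedding_cost` is a hypothetical helper.

```python
from decimal import Decimal

# USD per 1K tokens, as quoted in this README; check current pricing before relying on it.
PRICE_PER_1K_TOKENS = Decimal("0.0004")


def embedding_cost(total_tokens: int) -> Decimal:
    """Return the estimated cost in USD for embedding the given number of tokens."""
    return PRICE_PER_1K_TOKENS * Decimal(total_tokens) / Decimal(1000)


single_name = embedding_cost(3)               # one room name of 3 tokens
million_names = embedding_cost(3 * 1_000_000)  # one million such names
print(single_name, million_names)
```

Under these assumptions a single room name costs about $0.0000012, and one million names cost about $1.20.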