🐘

PostgreSQL / SQL and sqlalchemy things

Anh-Thi Dinh
draft
⚠️
This is a quick & dirty draft, for me only!
Check the supported dimensions of a vector column
1SELECT DISTINCT vector_dims(embedding) FROM docembeddings;
2
3-- with count
4SELECT 
5    vector_dims(embedding) as current_dimensions,
6    COUNT(*) as count
7FROM docembeddings 
8WHERE embedding IS NOT NULL
9GROUP BY vector_dims(embedding);
Convert a column into using only one dimension
1-- Create backup table first
2CREATE TABLE docembeddings_backup AS 
3SELECT * FROM docembeddings;
4
5-- Remove the old column
6ALTER TABLE docembeddings DROP COLUMN embedding;
7
8-- Add new column with fixed 1536 dimensions
9ALTER TABLE docembeddings ADD COLUMN embedding vector(1536);
10
11-- Restore from backup (only 1536-dimension embeddings)
12UPDATE docembeddings 
13SET embedding = b.embedding
14FROM docembeddings_backup b
15WHERE docembeddings.id = b.id 
16    AND vector_dims(b.embedding) = 1536;
17    
18-- Remove backup table
19DROP TABLE docembeddings_backup;
Get all rows when sourcetype isn’t given (by :source). Otherwise, get only rows where sourcetype is given
1SELECT *
2FROM your_table
3WHERE (:source IS NULL OR sourcetype = :source)
Quick ref to SQLAlchemy
Nếu DELETE mà doc ko tồn tại thì vẫn ok. Khi ấy result.rowcount ra 0 (cái này là row affected)
1result = conn.execute(text("DELETE FROM document WHERE id=:id;"), {"id": d['id']})
2result.rowcount # number of rows deleted
Could not locate column in row for column 'status’
Có nghĩa là lúc SELECT .... thiếu đi “status”. Thêm nó vào là ok.
After removing a row, the foreign IDs in other places will be automatically deleted (the tables related to the ID in this table will also be deleted)
sqlalchemy.exc.ResourceClosedError: This result object is closed
You may only call first() once on a result set. after that, it's closed!
1# wrong (call .first() twice)
2for i in list(user.first().roles):
3    user.first().roles.remove(i)
4    
5# right
6first_user = user.first()
7for i in list(first_user.roles):
8    first_user.roles.remove(i)
Convert a result to a list of dict
1result = conn.execute(text("SELECT fromdocid, text FROM table WHERE id IN (:doc0, :doc1);"), {"doc0": 18, "doc1": 33})
2# convert result to a list of dict
3result = [{"docId": row.fromdocid, "content": row.text} for row in result]
4print(result)
Add a new column is_faq to the document table. If the create_date is after "2024-06-04" and the text column contains "Question: <any>, Answer: <any>", set is_faq to True; otherwise, set it to False.
1UPDATE document
2SET is_faq = 
3  CASE 
4    WHEN text ~ '(?s)Question:.*?Answer:.*?' THEN TRUE
5    ELSE FALSE
6  END
7WHERE create_time > '2024-06-04';
Only get column text if includeContent=true. If onlyFaqs=true, get all rows, otherwise, get only rows where is_faq=TRUE
1SELECT
2    column1,
3    column2,
4    -- Conditionally include the "text" column
5    CASE
6        WHEN :includeContent THEN text
7        ELSE NULL
8    END AS text
9FROM your_table
10WHERE (is_faq = TRUE OR NOT :onlyFaqs);