Using compiled version on Windows #98

Open
thewh1teagle opened this issue Aug 27, 2023 · 18 comments

Comments

@thewh1teagle

thewh1teagle commented Aug 27, 2023

Hi,
I built sqlite-vss on Windows 11 x64 using cygwin64, following the instructions in #building-sqlite-vss-yourself:

$ make loadable
cmake -B build; make -C build
-- Could NOT find MKL (missing: MKL_LIBRARIES)
-- Using the multi-header code from /cygdrive/c/Users/User/Documents/projects/vss/sqlite-vss/vendor/json/include/
-- Configuring done
-- Generating done
-- Build files have been written to: /cygdrive/c/Users/User/Documents/projects/vss/sqlite-vss/build
make[1]: Entering directory '/cygdrive/c/Users/User/Documents/projects/vss/sqlite-vss/build'
[  1%] Linking CXX shared library vector0.dll
...
[ 98%] Building CXX object vendor/faiss/faiss/CMakeFiles/faiss.dir/invlists/OnDiskInvertedLists.cpp.o
[100%] Linking CXX static library libfaiss.a
[100%] Built target faiss
make[1]: Leaving directory '/cygdrive/c/Users/User/Documents/projects/vss/sqlite-vss/build'
cp build/vector0.dll dist/debug/vector0.dll
cp build/vss0.dll dist/debug/vss0.dll

As shown, the build produced vector0.dll and vss0.dll.

I then placed the DLL files in sqlite-vss/vendor/sqlite/.libs, the folder that holds the compiled sqlite3.exe, and tried to load the extensions into SQLite:

C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs>sqlite3.exe
SQLite version 3.40.1 2022-12-28 14:03:47
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> .load vector0.dll
sqlite> .load vss0.dll
Error: The specified module could not be found.

sqlite>

It looks like vector0.dll loads successfully, but vss0.dll fails to load.

ldd output
C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs>ldd vector0.dll
      ntdll.dll => /cygdrive/c/Windows/SYSTEM32/ntdll.dll (0x7ffd2ad10000)
      KERNEL32.DLL => /cygdrive/c/Windows/System32/KERNEL32.DLL (0x7ffd2a160000)
      KERNELBASE.dll => /cygdrive/c/Windows/System32/KERNELBASE.dll (0x7ffd283c0000)
      msvcrt.dll => /cygdrive/c/Windows/System32/msvcrt.dll (0x7ffd2a0b0000)
      cygwin1.dll => /usr/bin/cygwin1.dll (0x7ffceece0000)
      cygstdc++-6.dll => /usr/bin/cygstdc++-6.dll (0x3fec80000)
      cyggcc_s-seh-1.dll => /usr/bin/cyggcc_s-seh-1.dll (0x3ff870000)
      advapi32.dll => /cygdrive/c/Windows/System32/advapi32.dll (0x7ffd2ab00000)
      sechost.dll => /cygdrive/c/Windows/System32/sechost.dll (0x7ffd293e0000)
      RPCRT4.dll => /cygdrive/c/Windows/System32/RPCRT4.dll (0x7ffd29920000)
      CRYPTBASE.DLL => /cygdrive/c/Windows/SYSTEM32/CRYPTBASE.DLL (0x7ffd27860000)
      bcryptPrimitives.dll => /cygdrive/c/Windows/System32/bcryptPrimitives.dll (0x7ffd288c0000)

C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs>ldd vss0.dll
      ntdll.dll => /cygdrive/c/Windows/SYSTEM32/ntdll.dll (0x7ffd2ad10000)
      KERNEL32.DLL => /cygdrive/c/Windows/System32/KERNEL32.DLL (0x7ffd2a160000)
      KERNELBASE.dll => /cygdrive/c/Windows/System32/KERNELBASE.dll (0x7ffd283c0000)
      msvcrt.dll => /cygdrive/c/Windows/System32/msvcrt.dll (0x7ffd2a0b0000)
      cygwin1.dll => /usr/bin/cygwin1.dll (0x7ffceece0000)
      cygstdc++-6.dll => /usr/bin/cygstdc++-6.dll (0x3fec80000)
      cyggcc_s-seh-1.dll => /usr/bin/cyggcc_s-seh-1.dll (0x3ff870000)
      cygblas-0.dll => /cygdrive/c/Users/User/Documents/projects/vss/sqlite-vss/vendor/sqlite/.libs/cygblas-0.dll (0x3fe3d0000)
      cyggomp-1.dll => /usr/bin/cyggomp-1.dll (0x3fe310000)
      cyglapack-0.dll => /cygdrive/c/Users/User/Documents/projects/vss/sqlite-vss/vendor/sqlite/.libs/cyglapack-0.dll (0x3f8160000)
      cyggfortran-5.dll => /usr/bin/cyggfortran-5.dll (0x3f8890000)
      cygquadmath-0.dll => /usr/bin/cygquadmath-0.dll (0x3fc150000)
      advapi32.dll => /cygdrive/c/Windows/System32/advapi32.dll (0x7ffd2ab00000)
      sechost.dll => /cygdrive/c/Windows/System32/sechost.dll (0x7ffd293e0000)
      RPCRT4.dll => /cygdrive/c/Windows/System32/RPCRT4.dll (0x7ffd29920000)
      CRYPTBASE.DLL => /cygdrive/c/Windows/SYSTEM32/CRYPTBASE.DLL (0x7ffd27860000)
      bcryptPrimitives.dll => /cygdrive/c/Windows/System32/bcryptPrimitives.dll (0x7ffd288c0000)
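
A minimal sketch of how the ldd listings above could be filtered to show only the dependencies that do not come from the Windows system directory, since those are the ones that have to be found at load time. The function name `cygwin_deps` and the rule "anything not under /cygdrive/c/Windows" are assumptions made for illustration, not part of sqlite-vss.

```python
def cygwin_deps(ldd_output):
    """Return names of dependencies that do not resolve into the Windows directory."""
    deps = []
    for line in ldd_output.splitlines():
        if "=>" not in line:
            continue  # skip prompts and blank lines
        name, _, rest = line.partition("=>")
        path = rest.strip().split(" (")[0]  # drop the trailing load address
        # Assumption: system DLLs live under /cygdrive/c/Windows
        if not path.lower().startswith("/cygdrive/c/windows/"):
            deps.append(name.strip())
    return deps


# A few lines taken from the vss0.dll listing above.
sample = """\
      ntdll.dll => /cygdrive/c/Windows/SYSTEM32/ntdll.dll (0x7ffd2ad10000)
      cygwin1.dll => /usr/bin/cygwin1.dll (0x7ffceece0000)
      cygblas-0.dll => /cygdrive/c/Users/User/Documents/projects/vss/sqlite-vss/vendor/sqlite/.libs/cygblas-0.dll (0x3fe3d0000)
      cyggomp-1.dll => /usr/bin/cyggomp-1.dll (0x3fe310000)
"""
deps = cygwin_deps(sample)
print(deps)
```

Every name in the result has to be locatable by whichever program loads vss0.dll, which is one possible reason for "The specified module could not be found".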
@thewh1teagle
Author

thewh1teagle commented Aug 28, 2023

Update:

I managed to load the extension into the sqlite3 CLI after compiling with make loadable-release. I also had to add cyglapack-0.dll to the same folder.
I successfully loaded the extensions in Python as well. Unfortunately, executing

db.execute("""
    CREATE VIRTUAL TABLE IF NOT EXISTS vss_post USING vss0(embeddings(3));
""")

crashes the program without any error message.

@asg017
Owner

asg017 commented Aug 28, 2023

Thanks for the detailed report and updates! You're the first person to report being able to compile sqlite-vss, so I'm very interested in getting this to work.

A few questions:

  1. After loading vector0.dll, does select vector_version() return a string?
  2. After loading vss0.dll, does select vss_version() return a string?
  3. Could you run select vss_distance_l1('[0.1, 0.1]', '[0.2, 0.2]') and see if that works?
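
The three checks above could also be scripted so that each query reports either its first row or the error SQLite raised. This is only a sketch: `run_checks` is a hypothetical helper, and because no extension is loaded here, a built-in expression and an unregistered function stand in for the real queries.

```python
import sqlite3


def run_checks(conn, queries):
    """Run each query and record its first row, or the error SQLite raised."""
    results = {}
    for sql in queries:
        try:
            results[sql] = conn.execute(sql).fetchone()
        except sqlite3.OperationalError as exc:
            results[sql] = f"error: {exc}"
    return results


conn = sqlite3.connect(":memory:")
# Assumption: with vector0 and vss0 loaded, the list would hold the three
# queries from the questions above instead of these stand-ins.
checks = run_checks(conn, ["select 1 + 1", "select vss_version()"])
for sql, outcome in checks.items():
    print(sql, "->", outcome)
conn.close()
```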

@thewh1teagle
Author

thewh1teagle commented Aug 28, 2023

  1. yes
  2. yes
  3. works

See it in action:

C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs>sqlite3.exe
SQLite version 3.40.1 2022-12-28 14:03:47
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> .open random.db
sqlite> .load vector0.dll
sqlite> select vector_version();
v0.1.2
sqlite> .load vss0.dll
sqlite> select vector_version();
v0.1.2
sqlite> select vss_distance_l1('[0.1, 0.1]', '[0.2, 0.2]');
0.200000002980232
sqlite>

When running the same statements from Python, the program crashes without an error at:

select vss_distance_l1('[0.1, 0.1]', '[0.2, 0.2]');

@asg017
Owner

asg017 commented Aug 28, 2023

Does it throw an OperationalError, or just completely crash? Is there a segmentation fault, or any other messaging that gets logged out?
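
One way to get more messaging out of a hard crash is Python's standard `faulthandler` module, which prints the Python traceback to stderr when the interpreter hits a fatal fault. The sketch below only shows where it would be enabled; whether it produces output for this particular crash is an assumption, and without the extension loaded the query simply raises an ordinary, catchable error.

```python
import faulthandler
import sqlite3

# Ask the interpreter to dump the Python traceback on a fatal fault.
faulthandler.enable()

conn = sqlite3.connect(":memory:")
try:
    conn.execute("select vss_distance_l1('[0.1, 0.1]', '[0.2, 0.2]')")
except sqlite3.OperationalError as exc:
    # An ordinary SQLite error (here: the extension is not loaded) is catchable.
    print("OperationalError:", exc)
finally:
    conn.close()
```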

@thewh1teagle
Author

No error, it just exits from Python with status 1.

Code here
import sqlite3

conn = sqlite3.connect('random.db')
conn.enable_load_extension(True)
conn.load_extension('vector0.dll')
conn.load_extension('vss0.dll')

cur = conn.cursor()
cur.execute('select vector_version();')
version = cur.fetchone()
print(version) # Working

cur.execute("select vss_distance_l1('[0.1, 0.1]', '[0.2, 0.2]');") # < ---- Crash
res = cur.fetchone()
print(res)

cur.close()
conn.close()
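
Because the crash takes the interpreter down with it, the script can be run in a child process so the parent survives and can inspect the exit status. A minimal sketch, assuming a hypothetical helper `run_and_report`; the child here is a stand-in that exits with status 1, and 0xC0000005 is the Windows status code for an access violation.

```python
import subprocess
import sys


def run_and_report(args):
    """Run a command in a child process and describe how it ended."""
    proc = subprocess.run(args)
    code = proc.returncode & 0xFFFFFFFF  # view the exit status as unsigned 32-bit
    if code == 0xC0000005:
        return code, "access violation (STATUS_ACCESS_VIOLATION)"
    return code, "ok" if code == 0 else "non-zero exit"


# Stand-in child that exits with status 1; to test the real script, the
# arguments would be [sys.executable, "main.py"] instead.
code, meaning = run_and_report([sys.executable, "-c", "import sys; sys.exit(1)"])
print(hex(code), meaning)
```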

@asg017
Owner

asg017 commented Aug 28, 2023

The fact that it only happens when executing functions that use Faiss's vector computations (i.e. it fails on vss_distance_l1 and not vss_version) makes me think that it's a dynamic library error. I'm guessing that Windows holds off on resolving and executing the cygblas-0.dll / cyggomp-1.dll / cyglapack-0.dll libraries until they're actually needed. It also might explain the spectacular no-error-message failures: if it's a deep underlying DLL error, then Python may not have a chance to catch it.

That's my guess at least; my knowledge of Windows is very limited. I'd say double-check that those DLLs exist and work correctly (probably with ldd?). I'd be curious to see if there's a sample Faiss C++ project you could compile and execute on your Windows machine, to see if it's a Faiss compilation error or a sqlite-vss-specific error.

@thewh1teagle
Author

I used the same faiss submodule as in your repo. I just installed the necessary tools and libraries through the Cygwin setup and followed your instructions to build the library.

@thewh1teagle
Author

I added strace output when running the python script

Strace output
User@DESKTOP-HPEE9O3 /cygdrive/c/Users/User/Documents/projects/vss/sqlite-vss/vendor/sqlite/.libs
$ strace /cygdrive/c/Users/User/AppData/Local/Programs/Python/Python311/python.exe main.py
--- Process 17912 created
--- Process 17912 loaded C:\Windows\System32\ntdll.dll at 00007ff9bb4f0000
--- Process 17912 loaded C:\Windows\System32\kernel32.dll at 00007ff9ba020000
--- Process 17912 loaded C:\Windows\System32\KernelBase.dll at 00007ff9b8cf0000
--- Process 17912 thread 19168 created
--- Process 17912 thread 9276 created
--- Process 17912 loaded C:\Windows\System32\ucrtbase.dll at 00007ff9b8b10000
--- Process 17912 thread 3052 created
--- Process 17912 loaded C:\Users\User\AppData\Local\Programs\Python\Python311\vcruntime140.dll at 00007ff9206e0000
--- Process 17912 loaded C:\Users\User\AppData\Local\Programs\Python\Python311\python311.dll at 00007ff915b90000
--- Process 17912 loaded C:\Windows\System32\version.dll at 00007ff9af240000
--- Process 17912 loaded C:\Windows\System32\ws2_32.dll at 00007ff9b9fa0000
--- Process 17912 loaded C:\Windows\System32\msvcrt.dll at 00007ff9b92b0000
--- Process 17912 loaded C:\Windows\System32\rpcrt4.dll at 00007ff9ba740000
--- Process 17912 loaded C:\Windows\System32\advapi32.dll at 00007ff9ba9f0000
--- Process 17912 loaded C:\Windows\System32\sechost.dll at 00007ff9b94f0000
--- Process 17912 loaded C:\Windows\System32\bcrypt.dll at 00007ff9b8210000
--- Process 17912 loaded C:\Windows\System32\bcryptprimitives.dll at 00007ff9b8a90000
--- Process 17912 loaded C:\Users\User\AppData\Local\Programs\Python\Python311\python3.dll at 000002db01780000
--- Process 17912 unloaded DLL at 000002db01780000
--- Process 17912 loaded C:\Users\User\AppData\Local\Programs\Python\Python311\python3.dll at 000002db01780000
--- Process 17912 loaded C:\Users\User\AppData\Local\Programs\Python\Python311\DLLs\_sqlite3.pyd at 00007ff9b31c0000
--- Process 17912 loaded C:\Users\User\AppData\Local\Programs\Python\Python311\DLLs\sqlite3.dll at 00007ff923b90000
--- Process 17912 loaded C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs\vector0.dll at 000000055d4d0000
--- Process 17912 loaded C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs\cyggcc_s-seh-1.dll at 00000003ff870000
--- Process 17912 loaded C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs\cygwin1.dll at 00007ff921030000
--- Process 17912 loaded C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs\cygstdc++-6.dll at 00000003fec80000
  0       0 [main] python (17912) **********************************************
206     206 [main] python (17912) Program name: c:\Users\User\AppData\Local\Programs\Python\Python311\python.exe (windows pid 17912)
165     371 [main] python (17912) OS version:   Windows NT-10.0
132     503 [main] python (17912) **********************************************
--- Process 17912 loaded C:\Windows\System32\cryptbase.dll at 00007ff9b8040000
3365    3868 [main] python (17912) sigprocmask: 0 = sigprocmask (0, 0x0, 0x7FF9213093B0)
533    4401 [main] python (17912) open_shared: name shared.5, shared 0x1A4000000 (wanted 0x1A4000000), h 0x1A0, m 0, created 1
216    4617 [main] python (17912) shared_info::initialize: Installation root: <┬ג┬ה> key: <a32a5794382fce65>
167    4784 [main] python (17912) user_heap_info::init: heap base 0xA00000000, heap top 0xA00000000, heap size 0x20000000 (536870912)
169    4953 [main] python (17912) open_shared: name S-1-5-21-567552140-2017299312-2275771347-1001.1, shared 0x1A4010000 (wanted 0x1A4010000), h 0x1A4, m 1, created 1
158    5111 [main] python (17912) user_info::create: opening user shared for 'S-1-5-21-567552140-2017299312-2275771347-1001' at 0x1A4010000
171    5282 [main] python (17912) user_info::create: user shared version 0
175    5457 [main] python (17912) dll_crt0_0: finished dll_crt0_0 initialization
202    5659 [main] python (17912) time: 1693222268 = time(0x0)
--- Process 17912 loaded C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs\vss0.dll at 00000005ca2f0000
--- Process 17912 loaded C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs\cygblas-0.dll at 00000003f8ca0000
--- Process 17912 loaded C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs\cyggomp-1.dll at 00000003fe310000
--- Process 17912 loaded C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs\cyglapack-0.dll at 00000003f8160000
--- Process 17912 loaded C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs\cyggfortran-5.dll at 00000003f8890000
--- Process 17912 loaded C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs\cygquadmath-0.dll at 00000003fc150000
('v0.1.2',)
28871   34530 [main] python (17912) mmap: addr 0x0, len 34319826944, prot 0x3, flags 0x22, fd -1, off 0x0
233972  268502 [main] python (17912) mmap: 0x6FF802610000 = mmap()
--- Process 17912, exception c0000005 at 00007ff921031026
--- Process 17912 thread 19168 exited with status 0xc0000005
--- Process 17912 thread 9276 exited with status 0xc0000005
--- Process 17912 thread 3052 exited with status 0xc0000005
--- Process 17912 exited with status 0xc0000005
Segmentation fault

Also, when I run the script with the cygwin64 build of Python, it works:

cygwin64 python output
$ file /usr/bin/python3.9.exe
/usr/bin/python3.9.exe: PE32+ executable (console) x86-64, for MS Windows, 11 sections

$ /usr/bin/python3.9.exe main.py
('v0.1.2',)
(0.20000000298023224,)
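
The strace log above records both the base address of every loaded DLL and the address of the c0000005 exception, so the two can be matched up. The sketch below picks the module whose base address is closest below the fault address. This is a heuristic, because the log does not give module sizes; `locate_fault` is a hypothetical helper, and the sample is a few lines copied from the log.

```python
import re

LOADED = re.compile(r"^--- Process \d+ loaded (?P<path>.+) at (?P<base>[0-9a-fA-F]+)$")
FAULT = re.compile(r"^--- Process \d+, exception (?P<code>[0-9a-fA-F]+) at (?P<addr>[0-9a-fA-F]+)$")


def locate_fault(strace_output):
    """Return (module path, offset) for the module loaded closest below the fault."""
    modules, fault = [], None
    for line in strace_output.splitlines():
        line = line.strip()
        if m := LOADED.match(line):
            modules.append((int(m["base"], 16), m["path"]))
        elif m := FAULT.match(line):
            fault = int(m["addr"], 16)
    if fault is None:
        return None
    below = [(base, path) for base, path in modules if base <= fault]
    if not below:
        return None
    base, path = max(below)  # highest base address that is still <= fault
    return path, fault - base


sample = r"""
--- Process 17912 loaded C:\Users\User\AppData\Local\Programs\Python\Python311\vcruntime140.dll at 00007ff9206e0000
--- Process 17912 loaded C:\Users\User\AppData\Local\Programs\Python\Python311\DLLs\sqlite3.dll at 00007ff923b90000
--- Process 17912 loaded C:\Users\User\Documents\projects\vss\sqlite-vss\vendor\sqlite\.libs\cygwin1.dll at 00007ff921030000
--- Process 17912, exception c0000005 at 00007ff921031026
"""
located = locate_fault(sample)
print(located[0], hex(located[1]))
```

For these lines the fault address sits 0x1026 bytes past the base of cygwin1.dll, which suggests the access violation happens inside the Cygwin runtime itself rather than in vss0.dll or Python.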

@thewh1teagle
Author

I tried a sample Faiss C++ project, and it compiled and ran without errors:

code
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <random>
#include <vector>

#include <sys/time.h>

#include <faiss/IndexFlat.h>
#include <faiss/IndexIVFFlat.h>
#include <faiss/IndexPQ.h>
#include <faiss/index_io.h>

double elapsed() {
  struct timeval tv;
  gettimeofday(&tv, nullptr);
  return tv.tv_sec + tv.tv_usec * 1e-6;
}

int main() {
  double t0 = elapsed();

  // dimension of the vectors to index
  int d = 128;

  // size of the database we plan to index
  size_t nb = 1000 * 1000;

  // make a set of nt training vectors in the unit cube
  // (could be the database)
  size_t nt = 100 * 1000;

  //---------------------------------------------------------------
  // Define the core quantizer
  // We choose a multiple inverted index for faster training with less data
  // and because it usually offers best accuracy/speed trade-offs
  //
  // We here assume that its lifespan of this coarse quantizer will cover the
  // lifespan of the inverted-file quantizer IndexIVFFlat below
  // With dynamic allocation, one may give the responsability to free the
  // quantizer to the inverted-file index (with attribute do_delete_quantizer)
  //
  // Note: a regular clustering algorithm would be defined as:
  //       faiss::IndexFlatL2 coarse_quantizer (d);
  //
  // Use nhash=2 subquantizers used to define the product coarse quantizer
  // Number of bits: we will have 2^nbits_coarse centroids per subquantizer
  //                 meaning (2^12)^nhash distinct inverted lists
  size_t nhash = 2;
  size_t nbits_subq = int(log2(nb + 1) / 2);     // good choice in general
  size_t ncentroids = 1 << (nhash * nbits_subq); // total # of centroids

  faiss::MultiIndexQuantizer coarse_quantizer(d, nhash, nbits_subq);

  printf("IMI (%ld,%ld): %ld virtual centroids (target: %ld base vectors)",
         nhash,
         nbits_subq,
         ncentroids,
         nb);

  // the coarse quantizer should not be dealloced before the index
  // 4 = nb of bytes per code (d must be a multiple of this)
  // 8 = nb of bits per sub-code (almost always 8)
  faiss::MetricType metric = faiss::METRIC_L2; // can be METRIC_INNER_PRODUCT
  faiss::IndexIVFFlat index(&coarse_quantizer, d, ncentroids, metric);
  index.quantizer_trains_alone = true;

  // define the number of probes. 2048 is for high-dim, overkilled in practice
  // Use 4-1024 depending on the trade-off speed accuracy that you want
  index.nprobe = 2048;

  std::mt19937 rng;
  std::uniform_real_distribution<> distrib;

  { // training
      printf("[%.3f s] Generating %ld vectors in %dD for training\n",
             elapsed() - t0,
             nt,
             d);

      std::vector<float> trainvecs(nt * d);
      for (size_t i = 0; i < nt * d; i++) {
          trainvecs[i] = distrib(rng);
      }

      printf("[%.3f s] Training the index\n", elapsed() - t0);
      index.verbose = true;
      index.train(nt, trainvecs.data());
  }

  size_t nq;
  std::vector<float> queries;

  { // populating the database
      printf("[%.3f s] Building a dataset of %ld vectors to index\n",
             elapsed() - t0,
             nb);

      std::vector<float> database(nb * d);
      for (size_t i = 0; i < nb * d; i++) {
          database[i] = distrib(rng);
      }

      printf("[%.3f s] Adding the vectors to the index\n", elapsed() - t0);

      index.add(nb, database.data());

      // remember a few elements from the database as queries
      int i0 = 1234;
      int i1 = 1244;

      nq = i1 - i0;
      queries.resize(nq * d);
      for (int i = i0; i < i1; i++) {
          for (int j = 0; j < d; j++) {
              queries[(i - i0) * d + j] = database[i * d + j];
          }
      }
  }

  { // searching the database
      int k = 5;
      printf("[%.3f s] Searching the %d nearest neighbors "
             "of %ld vectors in the index\n",
             elapsed() - t0,
             k,
             nq);

      std::vector<faiss::idx_t> nns(k * nq);
      std::vector<float> dis(k * nq);

      index.search(nq, queries.data(), k, dis.data(), nns.data());

      printf("[%.3f s] Query results (vector ids, then distances):\n",
             elapsed() - t0);

      for (int i = 0; i < nq; i++) {
          printf("query %2d: ", i);
          for (int j = 0; j < k; j++) {
              printf("%7ld ", nns[j + i * k]);
          }
          printf("\n     dis: ");
          for (int j = 0; j < k; j++) {
              printf("%7g ", dis[j + i * k]);
          }
          printf("\n");
      }
  }
  return 0;
}
Compile output
$ g++ main.cpp -I./sqlite-vss/vendor/faiss ./sqlite-vss/build_release/vendor/faiss/faiss/libfaiss.a  -fopenmp -lblas -llapack
$ ls
a.exe  main.cpp  sqlite-vss
Program output
$ ./a.exe
IMI (2,9): 262144 virtual centroids (target: 1000000 base vectors)[0.005 s] Generating 100000 vectors in 128D for training
[0.648 s] Training the index
Training level-1 quantizer
IVF quantizer trains alone...
Training IVF residual
IndexIVF: no residual training
[8.816 s] Building a dataset of 1000000 vectors to index
[15.241 s] Adding the vectors to the index
MultiIndexQuantizer::search: 0:32768 / 1000000
MultiIndexQuantizer::search: 32768:65536 / 1000000
MultiIndexQuantizer::search: 65536:98304 / 1000000
MultiIndexQuantizer::search: 98304:131072 / 1000000
MultiIndexQuantizer::search: 131072:163840 / 1000000
MultiIndexQuantizer::search: 163840:196608 / 1000000
MultiIndexQuantizer::search: 196608:229376 / 1000000
MultiIndexQuantizer::search: 229376:262144 / 1000000
MultiIndexQuantizer::search: 262144:294912 / 1000000
MultiIndexQuantizer::search: 294912:327680 / 1000000
MultiIndexQuantizer::search: 327680:360448 / 1000000
MultiIndexQuantizer::search: 360448:393216 / 1000000
MultiIndexQuantizer::search: 393216:425984 / 1000000
MultiIndexQuantizer::search: 425984:458752 / 1000000
MultiIndexQuantizer::search: 458752:491520 / 1000000
MultiIndexQuantizer::search: 491520:524288 / 1000000
MultiIndexQuantizer::search: 524288:557056 / 1000000
MultiIndexQuantizer::search: 557056:589824 / 1000000
MultiIndexQuantizer::search: 589824:622592 / 1000000
MultiIndexQuantizer::search: 622592:655360 / 1000000
MultiIndexQuantizer::search: 655360:688128 / 1000000
MultiIndexQuantizer::search: 688128:720896 / 1000000
MultiIndexQuantizer::search: 720896:753664 / 1000000
MultiIndexQuantizer::search: 753664:786432 / 1000000
MultiIndexQuantizer::search: 786432:819200 / 1000000
MultiIndexQuantizer::search: 819200:851968 / 1000000
MultiIndexQuantizer::search: 851968:884736 / 1000000
MultiIndexQuantizer::search: 884736:917504 / 1000000
MultiIndexQuantizer::search: 917504:950272 / 1000000
MultiIndexQuantizer::search: 950272:983040 / 1000000
MultiIndexQuantizer::search: 983040:1000000 / 1000000
IndexIVFFlat::add_core: added 1000000 / 1000000 vectors
[20.667 s] Searching the 5 nearest neighbors of 10 vectors in the index
[20.684 s] Query results (vector ids, then distances):
query  0:    1234   65776  815632  518751  168411
   dis:       0 13.2041 13.7313 13.9331 13.9852
query  1:    1235  235209   32981  339156  485140
   dis:       0 12.5675 13.2757 13.3526 13.3626
query  2:    1236   46384  393794  279123  803578
   dis:       0 13.2079  13.337 13.5685 13.5999
query  3:    1237  172600  435871  490284  116815
   dis:       0 12.9845 13.4125 13.4894 13.5741
query  4:    1238  185348  630264  685103  672356
   dis:       0 11.3711 12.2562 12.2871 12.2897
query  5:    1239  820990  306204    3096  549432
   dis:       0 12.4804 13.2535 13.5721 13.5853
query  6:    1240  122701  687644  802575  350632
   dis:       0 13.7758 13.9611 14.1327 14.2155
query  7:    1241  985126  686744  336958  926803
   dis:       0 13.2923  13.636 13.7428 14.0614
query  8:    1242  880999  488401  181311  712631
   dis:       0 13.1505 13.4343  13.488 13.6331
query  9:    1243  829029  233144  108428  402759
   dis:       0 12.5892 12.7777 12.8653 13.1447

$ echo $?
0

@asg017
Owner

asg017 commented Aug 28, 2023

So to summarize, on your windows machine using cygwin64:

  • Compiling a sample C++ Faiss program works as expected
  • Running sqlite-vss with the cygwin64 version of Python works as expected
  • Running sqlite-vss with regular Python does not work and crashes
  • Running sqlite-vss from the sqlite3 CLI does not work and still crashes

Is that right?

@thewh1teagle
Author

Everything is correct except the last one:
using sqlite-vss from the sqlite3 CLI works, both from the Cygwin environment and from plain cmd.

@asg017
Owner

asg017 commented Aug 29, 2023

Cool - so is there anything actionable you'd like from this issue then? My guess is that since sqlite-vss was built with cygwin64, it requires cygwin64 applications to load the extension.

I'll probably try compiling sqlite-vss with cygwin64 on a GitHub Actions runner, but it's been very difficult in the past.
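
If the guess above holds (a cygwin64-built extension needs a cygwin64 host), a script could check which kind of interpreter it is running under before loading the extension. A minimal sketch: `sys.platform` is "cygwin" under Cygwin's Python and "win32" under the native Windows build; the helper names and message texts are hypothetical.

```python
import sys


def is_cygwin_python(platform=sys.platform):
    """True when the interpreter was built for Cygwin (sys.platform == 'cygwin')."""
    return platform == "cygwin"


def describe_runtime(platform=sys.platform):
    """Summarize what loading a Cygwin-built extension means on this platform."""
    if is_cygwin_python(platform):
        return "Cygwin Python: same runtime as a Cygwin-built extension"
    if platform == "win32":
        # Assumption based on the thread: this is the combination that crashed.
        return "native Windows Python: a Cygwin-built extension pulls in cygwin1.dll"
    return f"other platform: {platform}"


print(describe_runtime())
```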

@thewh1teagle
Author

thewh1teagle commented Aug 29, 2023

Currently I want to figure out why I get a segfault when running from a Python / Node.js that is not part of cygwin.
I'm not sure how to debug it.

It will not be usable if we can't use it with a regular, non-cygwin Python.

@thewh1teagle
Author

I managed to compile faiss on Windows in an MSYS2 environment.
MSYS2 works pretty well, and I think it would be fairly easy to use in GitHub Actions as well.
facebookresearch/faiss#3067

@leonsmiers

leonsmiers commented Jan 7, 2024

Hello,
I am trying to use VSS in combination with SQLite on Windows. I like the approach of making VSS part of the query search.
Did you make any progress with the Windows install?
I'm currently struggling with the settings in the Makefile and CMakeLists.txt.

Thanks in advance,
Léon

@ma-chengyuan

ma-chengyuan commented Mar 17, 2024

Here's a way that worked for me, based on @thewh1teagle's approach. I haven't tested it in depth, but the sqlite3 CLI can load the extensions and select vss_distance_l1('[0.1, 0.1]', '[0.2, 0.2]') works.

  1. Install MSYS2
  2. Open UCRT terminal
  3. Run
# see https://github.com/facebookresearch/faiss/issues/3067#issuecomment-1873007384
pacman --needed -S $MINGW_PACKAGE_PREFIX-{toolchain,cmake,make,swig,autotools,lapack} git
git clone https://github.com/asg017/sqlite-vss.git && cd sqlite-vss
# see https://github.com/asg017/sqlite-vss/blob/main/docs.md
./vendor/get_sqlite.sh
cd vendor/sqlite
./configure && make
cd ../../
  4. Apply the following patch to vendor/faiss:
diff --git a/faiss/CMakeLists.txt b/faiss/CMakeLists.txt
index 16eb9e9c..940ba03f 100644
--- a/faiss/CMakeLists.txt
+++ b/faiss/CMakeLists.txt
@@ -214,8 +214,8 @@ add_library(faiss_avx2 ${FAISS_SRC})
 if(NOT FAISS_OPT_LEVEL STREQUAL "avx2")
   set_target_properties(faiss_avx2 PROPERTIES EXCLUDE_FROM_ALL TRUE)
 endif()
-if(NOT WIN32)
-  target_compile_options(faiss_avx2 PRIVATE $<$<COMPILE_LANGUAGE:CXX>:-mavx2 -mfma -mf16c -mpopcnt>)
+if(NOT MSVC)
+  target_compile_options(faiss_avx2 PRIVATE $<$<COMPILE_LANGUAGE:CXX>:-mavx2 -mfma -mf16c -mpopcnt -fpermissive>)
 else()
   # MSVC enables FMA with /arch:AVX2; no separate flags for F16C, POPCNT
   # Ref. FMA (under /arch:AVX2): https://docs.microsoft.com/en-us/cpp/build/reference/arch-x64
diff --git a/faiss/impl/platform_macros.h b/faiss/impl/platform_macros.h
index 9cec8260..44293e3e 100644
--- a/faiss/impl/platform_macros.h
+++ b/faiss/impl/platform_macros.h
@@ -83,6 +83,17 @@ inline int __builtin_clzll(uint64_t x) {
 #endif

 #else
+
+/*******************************************************
+ * Windows MinGW
+ *******************************************************/
+#ifdef _WIN32
+
+#define posix_memalign(p, a, s) \
+    (((*(p)) = _aligned_malloc((s), (a))), *(p) ? 0 : errno)
+#endif
+
+
 /*******************************************************
  * Linux and OSX
  *******************************************************/
diff --git a/faiss/invlists/InvertedListsIOHook.cpp b/faiss/invlists/InvertedListsIOHook.cpp
index 0081c4f9..2c3a6006 100644
--- a/faiss/invlists/InvertedListsIOHook.cpp
+++ b/faiss/invlists/InvertedListsIOHook.cpp
@@ -13,9 +13,9 @@

 #include <faiss/invlists/BlockInvertedLists.h>

-#ifndef _MSC_VER
+#ifndef _WIN32
 #include <faiss/invlists/OnDiskInvertedLists.h>
-#endif // !_MSC_VER
+#endif // !_WIN32

 namespace faiss {

@@ -33,7 +33,7 @@ namespace {
 /// std::vector that deletes its contents
 struct IOHookTable : std::vector<InvertedListsIOHook*> {
     IOHookTable() {
-#ifndef _MSC_VER
+#ifndef _WIN32
         push_back(new OnDiskInvertedListsIOHook());
 #endif
         push_back(new BlockInvertedListsIOHook());
  5. Build with
cmake -B build-release . -G "MinGW Makefiles" -DCMAKE_BUILD_TYPE=Release
cmake --build build-release -- -j<number of cores here>
# copy dlls for use outside of MSYS2
cp /ucrt64/bin/{libgcc_s_seh-1.dll,libwinpthread-1.dll,libblas.dll,libgomp-1.dll,liblapack.dll,libgfortran-5.dll,libquadmath-0.dll,libstdc++-6.dll} ./build-release/
  6. Done! Just remember to copy all DLLs under build-release on deployment.
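
The deployment step could be scripted along these lines. This is a sketch only: `copy_dlls` is a hypothetical helper, and the demonstration uses empty placeholder files in a temporary directory rather than a real build-release folder.

```python
import shutil
import tempfile
from pathlib import Path


def copy_dlls(build_dir, deploy_dir):
    """Copy every .dll in build_dir into deploy_dir and return the copied names."""
    deploy = Path(deploy_dir)
    deploy.mkdir(parents=True, exist_ok=True)
    copied = []
    for dll in sorted(Path(build_dir).glob("*.dll")):
        shutil.copy2(dll, deploy / dll.name)
        copied.append(dll.name)
    return copied


# Demonstration with placeholder files; a real run would point at build-release.
with tempfile.TemporaryDirectory() as tmp:
    build = Path(tmp) / "build-release"
    build.mkdir()
    for name in ("vss0.dll", "libgomp-1.dll", "notes.txt"):
        (build / name).touch()
    copied = copy_dlls(build, Path(tmp) / "deploy")
    print(copied)
```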

@wldbest

wldbest commented Mar 31, 2024

Hi @ma-chengyuan, could you please share the DLL files?
I would like to use the Windows version of sqlite-vss, but the build process seems too complicated. If you can share them, I would like to check whether they work in my environment.

@bqhuyy

bqhuyy commented Apr 8, 2024

@ma-chengyuan hi, I followed your instructions. The DLLs work only with the sqlite tool. The built-in sqlite3 (installed using the .msi file on Windows) and node sqlite3 cannot load the DLL as an extension.
Here is the error message: The specified module could not be found
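
On Windows, "The specified module could not be found" is also reported when a dependency of the DLL cannot be located, not only when the DLL itself is missing. One thing that might be worth trying is putting the folder that holds the copied runtime DLLs at the front of PATH before loading the extension. The sketch below shows only the PATH handling; `prepend_to_path` is a hypothetical helper, and whether this resolves the error in this setup is an assumption.

```python
import os


def prepend_to_path(directory, env=os.environ):
    """Put directory first on PATH so Windows can search it for dependent DLLs."""
    directory = os.path.abspath(directory)
    parts = [p for p in env.get("PATH", "").split(os.pathsep) if p]
    if directory not in parts:
        parts.insert(0, directory)
    env["PATH"] = os.pathsep.join(parts)
    return env["PATH"]


# Demonstration on a plain dict so the real environment is left untouched.
fake_env = {"PATH": os.pathsep.join(["first", "second"])}
new_path = prepend_to_path("build-release", env=fake_env)
print(new_path.split(os.pathsep)[0])
```

In a real script the call would use the default `env` and run before `conn.load_extension(...)`.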


6 participants