
File Browser Doesn't Show Files #6

Open
nategoethel opened this issue Mar 17, 2024 · 19 comments
@nategoethel

          I can't seem to select an individual file in that folder though.

Originally posted by @nigelp in #4 (comment)

@nategoethel
Author

Clicking the Upload Documents button opens the Windows file explorer. However, this view does not actually show the files available within a directory. For example, this folder has a PDF in it:

[screenshot: the folder and its PDF in Windows Explorer]

But if you look at it in Dot:

[screenshot: the same folder viewed in Dot]

@alexpinel
Owner

Hi! Right now the app only supports selecting folders, so if you select the folder in which the document is, it should load all the contents. I know it's a bit counterintuitive and I am working on that!
Also, some people have reported issues running the LLM on Windows, so please let me know if you have the same problem.

@nigelp

nigelp commented Mar 17, 2024

Yeah, basically it's not working on Windows at all as far as I can tell.

@alexpinel
Owner

Hmmm, thanks for letting me know. I still haven't found the source of the issue... I'll pull the Windows version from the website for now and will keep you updated.

The fact that the documents take a while to load would indicate that the Python backend is working and is connected to the rest of the app. Maybe the issue is related to the Python dependencies needed to run the LLM, or to the LLM file itself; if it is the former, I'd imagine it is probably llama-cpp and its CUDA configuration. If you have installed Dot on your desktop, could you please run the following commands (adjust the path for your system):

& 'C:\Users\Desktop\Dot\resources\llm\python\python.exe' -m pip uninstall llama-cpp-python
& 'C:\Users\Desktop\Dot\resources\llm\python\python.exe' -m pip install llama-cpp-python

This would reinstall the llama-cpp library used to run the LLM and remove the GPU acceleration settings. That would of course make the program slower, but if it works it would indicate where the problem is. Please let me know if you can try this and if it works! :)

@nigelp

nigelp commented Mar 17, 2024

Hmm...uninstalled OK, but reinstall seemed to barf.
[screenshot: dotPython]

@alexpinel
Owner

According to the llama-cpp-python documentation, it appears that Visual Studio is required to install the library; this is because a C compiler is needed to build llama-cpp.

@nigelp

nigelp commented Mar 17, 2024

OK, well it slammed my CPU for a while but worked this time. Produced a summary of a 1-page docx in around 50 seconds. But I'm running an i7 with an 8GB RTX 4060, so it probably should be a little faster? But at least it's working now. :)

@alexpinel
Owner

That's amazing news! Thank you so much for the help; now I at least know where the problem comes from! If you want to set it up for GPU acceleration, the following steps should work:

1- Install CUDA toolkit: link
2- Uninstall llama-cpp-python: & 'C:\Users\Desktop\Dot\resources\llm\python\python.exe' -m pip uninstall llama-cpp-python
3- Reinstall it with the following command: & "C:\Users\Desktop\Dot\resources\llm\python\python.exe" -c "import os; os.environ['CMAKE_ARGS'] = '-DLLAMA_CUBLAS=on'; os.environ['FORCE_CMAKE'] = '1'; import pip._internal; pip._internal.main(['install', '--upgrade', '--force-reinstall', 'llama-cpp-python', '--no-cache-dir'])"

@nigelp

nigelp commented Mar 17, 2024

Hmm... thanks. You had an extra " after the python.exe, but even when I fixed that it failed again. You're probably going to need the whole file this time. :)
llamafail.txt

@nigelp

nigelp commented Mar 18, 2024

OK fixed it, got it working, kind of - thanks to Claude 3 Opus :) Reinstalled CUDA and at least it seems to access something.

BUT it can't access the PDFs. Tried both local and the Big Dot and both failed. Big dot hallucinated an answer after saying "I'm unable to access or view the Q Star document directly. However, I can share with you..."

@alexpinel
Owner

Nice! That is really interesting. Big Dot's answer makes sense, as it is meant for general use and is not aware of the documents. Did Doc Dot give any answer at all, or was it stuck again?

Also how are you finding Claude 3? Is it really better than GPT4?

@nigelp

nigelp commented Mar 18, 2024

Hmm... the Doc Dot answer was three lines of lame. :) It started with "I cannot directly access or summarize a PDF from this text alone. However, I can tell you that..." It's a shame, because the UI and the potential for adoption are really good. But it's not reading the doc at all, it seems.

Claude Opus is SO much better than GPT4. It's my daily driver now.

@alexpinel
Owner

Hmmm, there are a few things I can think of that could cause the issue here:

1- It does not have any access to the PDF: In such a case it would reply something along the lines of "I do not have the answer to that, the text only mentions 'foo' and 'bar'"

2- Maybe it did not understand the prompt properly: Depending on the prompt, Dot can get confused. Because of the way Dot works, it only has access to the text inside the document, so asking "what is doc X about?" might lead to the model searching for references to "doc X" inside the document itself and not finding anything (a rough sketch of this retrieval step follows the list). (Making it more aware of the documents themselves is something I'm trying to figure out, but it's turning out to be quite a challenge.)

3- Its replies are complete nonsense: In such a case it probably has more to do with the LLM itself than with the embeddings, but this only seems to happen when there are issues with the context length, which as far as I understand shouldn't be the case in Doc Dot.
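To illustrate point 2: the retrieval step roughly works by embedding the question and picking the chunks whose embeddings are closest to it, so a query like "what is doc X about?" only matches chunks that literally resemble that phrasing. A minimal sketch of the idea (not Dot's actual code; all names here are made up for illustration):

import numpy as np

def top_k_chunks(question_vec, chunk_vecs, chunks, k=4):
    # Cosine similarity between the question embedding and every stored chunk embedding
    sims = chunk_vecs @ question_vec / (
        np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(question_vec) + 1e-8
    )
    # Only the k most similar chunks are passed to the LLM as 'context'
    best = np.argsort(sims)[::-1][:k]
    return [chunks[i] for i in best]

If nothing in the document resembles the question, the chunks that come back are essentially arbitrary, and the model has little to work with.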

Do any of these align with what you are seeing? Also really tempted to get Claude 3, especially with all the uni coursework I have lately lol :)

@nigelp

nigelp commented Mar 18, 2024

Interesting. Yes, that would make sense about the prompting. I've just been asking 'summarise the document', and the results I get back are not useful. Do you have some sort of base prompt in the background you could tweak, perhaps? Like 'you are a helpful assistant who can read the contents of documents and blah'? Or you're probably doing that already? Maybe you should offer some optimum prompts for users to try, to get the best results? Let me send over a full answer response.

@nigelp

nigelp commented Mar 18, 2024

And yes, go for Claude 3. It's so refreshing to get actual useful answers instead of the 'I'm sorry I am only...' rubbish from GPT4 nowadays.

@alexpinel
Owner

The current base prompt is "Use the following pieces of context to answer the question at the end. If you do not know the answer, just say you don't know, don't try to make up an answer." And looking at it, I can see why it might be confused by 'summarise the document', as the prompt only mentions 'context'.

I will try to modify it and see if that changes anything. You can also modify it yourself if you want, as the Python scripts are easily accessible from within the app files; they should be in \resources\llm\scripts, and the system prompt is on line 77 of the docdot.py file.
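For anyone who wants to experiment before I push a change, a reworded prompt would look roughly like this. The snippet below is only an illustration of a possible rewording; the variable names and structure are made up, not what docdot.py actually contains:

# Illustrative only: a reworded system prompt that mentions documents and summaries explicitly
PROMPT_TEMPLATE = (
    "You are an assistant that answers questions about the user's documents. "
    "Use the following pieces of context, taken from those documents, to answer "
    "the question at the end. If asked to summarise, summarise the context itself. "
    "If you do not know the answer, just say you don't know; don't make one up.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
    "Answer:"
)

# The retrieved chunks and the user's question are substituted in before the text is sent to the LLM
prompt = PROMPT_TEMPLATE.format(context="...retrieved chunks...", question="summarise the document")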

Honestly, this is the funniest part of working with LLMs (however annoying it can be); giving actual human-like prompts that influence the way code behaves is quite surreal.

@nigelp

nigelp commented Mar 18, 2024

OK, did some tests.

  1. Asked Claude to create a suitable prompt. Did a very long version. I asked it to shorten the prompt and got:

"You are an AI document analysis and question-answering system. Your task is to read and comprehend documents, store the information, and provide accurate, relevant answers to user questions based on the document's content. When a question is asked, analyze it, search your memory for the most relevant information, and formulate a clear, concise response. If the document does not contain sufficient information to answer the question, inform the user accordingly."

  2. Tried loading a document (15MB PDF) about Q-Star. Attached.

  3. Asked this question: "list out the key points in this document"

  4. Got this answer (much hallucinated):

To list out the key points in this document, I would need to carefully read and understand the content of the document. Some potential key points from the provided text include:

Scott Aaronson wrote a letter to his 11-year-old self.
He discussed how he had always been interested in science and technology.
He mentioned that he had recently started learning about quantum mechanics.
He expressed excitement about the potential discoveries and innovations that could come from this field of study.
Alberto Romero posted about the general response to this text.
The response was generally positive, with many people expressing excitement and inspiration from Scott Aaronson's letter to his younger self.
Some people also shared their own experiences and stories related to science, technology, and personal growth.
Overall, the response was a testament to the power of curiosity, learning, and self-reflection in shaping our lives and the world around us.

I think the LLM is not up to the task.
q star info.pdf

  5. Interesting fact: this is the most information I've got out of Dot so far. So the prompt seems to have improved the response. Except it's completely bogus! :)

@alexpinel
Owner

That is really interesting, thank you very much for your help! But yeah, the context length of local models is not yet up to the task :/
At only 8000 tokens of context length, it is taking its answer from the chunk of text it considers 'most relevant' to the question asked, so in cases where lots of information is required it is not powerful enough...
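As a rough back-of-the-envelope illustration of why that matters (the chunk and overhead sizes below are guesses, not Dot's real settings):

CONTEXT_WINDOW = 8000   # tokens the local model can attend to (figure mentioned above)
PROMPT_OVERHEAD = 300   # rough guess for the system prompt plus the question
ANSWER_BUDGET = 500     # tokens reserved for the model's reply
CHUNK_TOKENS = 1000     # assumed size of each retrieved chunk

chunks_that_fit = (CONTEXT_WINDOW - PROMPT_OVERHEAD - ANSWER_BUDGET) // CHUNK_TOKENS
print(chunks_that_fit)  # 7, so only a handful of chunks from a long PDF ever reach the model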

Of course, this should get better as LLMs become more advanced, and I'd expect big improvements at the pace things are going. Something that could be done would be adding support for non-local LLMs using the OpenAI API, for example; this would massively increase the context length, but I'm not sure that makes much sense, as there are already tools that do that.

@nigelp

nigelp commented Mar 19, 2024

That's what I thought might be the problem. Have you tried other models? Or maybe an API solution to an open source model? I'm running openhermes-2.5-mistral-7b.Q3_K_M locally via Koboldcpp and it's quite fast. But I guess there's added load with vision?
