llama-utils¶
LlamaIndex utility package
llama-utils - Large Language Model Utility Package¶
llama-utils is a utility package for working with large language models. It is built on top of LlamaIndex and is designed to work with a locally running Ollama server.
Main Features¶
- Document storage and retrieval built on LlamaIndex
- Integration with local models served by Ollama (e.g., llama3)
Installing llama-utils¶
conda¶
Installing llama-utils from the conda-forge channel can be achieved by:
conda install -c conda-forge llama-utils=0.1.0
It is possible to list all the versions of llama-utils available on your platform with:
conda search llama-utils --channel conda-forge
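If you prefer an isolated installation, you can create and activate a dedicated conda environment first; the environment name and Python version below are only examples, adjust them to your setup:
conda create -n llama-utils python=3.11
conda activate llama-utils
conda install -c conda-forge llama-utils=0.1.0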
Install from GitHub¶
To install the latest development version, you can install the library directly from GitHub:
pip install git+https://github.com/Serapieum-of-alex/llama-utils
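pip can also install from a specific branch, tag, or commit by appending @<ref> to the repository URL; the ref below is only an illustration:
pip install git+https://github.com/Serapieum-of-alex/llama-utils@main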
pip¶
To install the latest release, you can simply use pip:
pip install llama-utils==0.1.0
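To confirm the installation and see which version and dependencies were installed:
pip show llama-utils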
Quick start¶
- First, download Ollama from the official website (https://ollama.com) and install it.
- Then run the following command to pull the llama3 model:
ollama pull llama3
- Then start the Ollama server (if you get an error, check the Errors section below to solve it; a quick way to verify the server is reachable is shown after the code snippet):
ollama serve
Now you can use the llama-utils package to interact with the Ollama server:
from llama_utils.retrieval.storage import Storage

# Directory where the storage/index will be persisted.
STORAGE_DIR = "examples/data/llama3"

# Create an empty storage, read documents from a local directory,
# add them to the storage, and save everything to disk.
storage = Storage.create()
data_path = "examples/data/essay"
docs = storage.read_documents(data_path)
storage.add_documents(docs)
storage.save(STORAGE_DIR)
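Before running the snippet above, it can help to check that the llama3 model has been pulled (ollama list shows the locally available models) and that the Ollama server is actually reachable. The check below is a minimal sketch using only the Python standard library; it assumes the default address 127.0.0.1:11434 shown in the Errors section:
import urllib.request

# The Ollama server answers a plain GET on its root URL with "Ollama is running".
OLLAMA_URL = "http://127.0.0.1:11434"

try:
    with urllib.request.urlopen(OLLAMA_URL, timeout=5) as response:
        print(response.read().decode())  # expected: "Ollama is running"
except OSError as exc:
    print(f"Ollama server is not reachable at {OLLAMA_URL}: {exc}")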
Errors¶
You might face the following error when you run the ollama serve command:
Error: listen tcp 127.0.0.1:11434: bind: Only one usage of each socket address (protocol/network address/port) is normally permitted.
This means that port 11434 is already in use. To solve this error, you can check which process is using the port by running the following command (on Windows):
netstat -ano | findstr :11434
TCP 127.0.0.1:11434 0.0.0.0:0 LISTENING 20796
The last column is the PID of the process holding the port (20796 in this example); terminate it with:
taskkill /F /PID 20796
SUCCESS: The process with PID 20796 has been terminated.
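On recent Windows versions, the lookup and kill can also be done in a single PowerShell step; this is only an alternative sketch, assuming the NetTCPIP module is available:
Get-NetTCPConnection -LocalPort 11434 | ForEach-Object { Stop-Process -Id $_.OwningProcess -Force }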
- Then you can run the ollama serve command again; you should see output similar to the following:
2024/11/22 23:20:04 routes.go:1189: INFO server config env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_DEBUG:false OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:C:\\Users\\eng_m\\.ollama\\models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:0 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://*] OLLAMA_SCHED_SPREAD:false OLLAMA_TMPDIR: ROCR_VISIBLE_DEVICES:]"
time=2024-11-22T23:20:04.393+01:00 level=INFO source=images.go:755 msg="total blobs: 28"
time=2024-11-22T23:20:04.395+01:00 level=INFO source=images.go:762 msg="total unused blobs removed: 0"
time=2024-11-22T23:20:04.397+01:00 level=INFO source=routes.go:1240 msg="Listening on 127.0.0.1:11434 (version 0.4.1)"
time=2024-11-22T23:20:04.400+01:00 level=INFO source=common.go:49 msg="Dynamic LLM libraries" runners="[cpu cpu_avx cpu_avx2 cuda_v11 cuda_v12 rocm]"
time=2024-11-22T23:20:04.400+01:00 level=INFO source=gpu.go:221 msg="looking for compatible GPUs"
time=2024-11-22T23:20:04.400+01:00 level=INFO source=gpu_windows.go:167 msg=packages count=1
time=2024-11-22T23:20:04.400+01:00 level=INFO source=gpu_windows.go:214 msg="" package=0 cores=8 efficiency=0 threads=16
time=2024-11-22T23:20:04.592+01:00 level=INFO source=types.go:123 msg="inference compute" id=GPU-04f76f9a-be0a-544b-9a6f-8607b8d0a9ab library=cuda variant=v12 compute=8.6 driver=12.6 name="NVIDIA GeForce RTX 3060 Ti" total="8.0 GiB" available="7.0 GiB"
Alternatively, you can keep the other process and start Ollama on a different port. The port is taken from the OLLAMA_HOST environment variable (visible in the server config above), so set it before starting the server:
set OLLAMA_HOST=127.0.0.1:11435
ollama serve
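If you move the server to a non-default port, any client has to be pointed at it as well; the ollama CLI reads the same OLLAMA_HOST variable, so set it in the shell from which you pull or run models, for example:
set OLLAMA_HOST=127.0.0.1:11435
ollama pull llama3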