Tshirtman

Well, it's a blog, you know the drill.

Feeling the Vibe

(edit, see update at the end)

The term vibe coding has been making waves recently, and I must admit, I was as worried and dismissive as anyone, but the tools sure are impressive. Not that you can mindlessly hand them all the work, but they can do a lot with only a limited amount of supervision, freeing up time to work on something else, even driving several of them at once.

There is the little problem of cost, though, when using commercial tools, but that's not the problem we're solving today.

The other problem is that if you want to work on two different things in the same repository, collisions will happen, unless you use git worktree or similar tools to keep multiple copies of the code, but that's a bit tedious to maintain, and takes time and space to set up.

That got me thinking about layers, like how Docker builds images in steps: each step creates a filesystem layer saving only the files created or modified since the previous one, and the result is presented to the running container as a superposition of all the layers. Could we use something similar?

Turns out that on Linux you can mount an overlay with fuse-overlayfs (combining lower/upper/work directories), and on macOS you can get similar behavior through APFS copy-on-write clones (via cp -c), so with a little automation we can set up private, ephemeral views without touching the original source.
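For the curious, here is roughly what that looks like done by hand; a minimal sketch, assuming Linux with fuse-overlayfs installed (the /tmp paths are illustrative, not ovlshell's actual layout):

# Linux: merge the project (read-only lower layer) with a scratch upper layer
mkdir -p /tmp/ovl/upper /tmp/ovl/work /tmp/ovl/merged
fuse-overlayfs -o lowerdir="$PWD",upperdir=/tmp/ovl/upper,workdir=/tmp/ovl/work /tmp/ovl/merged
cd /tmp/ovl/merged   # writes land in upper/, the original tree stays untouched

# macOS: no overlayfs, but an APFS clone is near-instant and shares disk blocks
cp -cR "$PWD" /tmp/session-clone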

So with ChatGPT doing all the work, I threw together this little tool, ovlshell; usage is pretty brainless.

cd your_project
ovlshell # (or create an alias for it)
claude # or amp or codex or whatever tool you use, AI or not.
exit # or ctrl-d
# you will be prompted for deletion of the overlay directory structure,
# it's your responsibility to clean up if you don't.

And while one session is chugging along in its isolated environment, you can start another in the same project directory, since each session's changes are invisible to the others, and another… and another, until you run out of tokens, money, memory, or brain capacity, that is.

Sessions live in directories under /tmp/, and they get a random cute name unless you give them a custom one. The default command is your $SHELL, but you can pass a custom one. While it's tempting to use claude or whatever as the command, exiting it would kill the session (though cleanup requires confirmation), and that might cause regret, so having a shell layer in the middle seems prudent.

update 04/09/2025: well, trying it on macOS made me reassess using git worktree as a backend, and I was wrong, it's a lot faster than I thought it would be. The only problem I have with it is the UX, not the performance, so wrapping it in the interface of ovlshell (which still needs a better name, though) is still a good move.
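For reference, this is the stock git worktree dance such a wrapper would hide (standard git commands; the path and branch name are made up for the example):

git worktree add /tmp/session-foo -b session-foo   # fresh checkout on its own branch
cd /tmp/session-foo                                # hack away in isolation
git worktree remove /tmp/session-foo               # clean up when done
git branch -d session-foo                          # and drop the branch if merged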

Local “copilot”-like development with Vim.

Why?

Like many people, I've been curious (but hesitant) to jump on the trend of using LLMs for coding. One of my reluctances is that I didn't want to depend on a third-party service, paid or not, during my development. I know all things are by nature ephemeral, but I would like, if possible, my tools to stay in my control.

I've also not been too good at using a separate tool: stopping my workflow to ask a question to an LLM, giving context, etc, only made sense when I was hitting a stumbling block, and in that case, I should rather think and do research than ask an LLM for a magical solution (though sometimes it can help). My impression was that the more useful case was for mundane things, when I know full well what to write, but an LLM can also pretty quickly see where this is going, and complete the idea, saving me a lot of typing.

So I was more tempted to use local models than remote ones, and I wanted things to integrate with Vim (no, not neovim, for reasons I won't get into now, I'm sticking with the traditional one, at least for now), functioning as a completion engine.

How

After exploring a few solutions, here is what I found to work decently for me; a condensed sketch of the steps follows the list.

  • installing llama.cpp to run models.
  • adding Plug 'ggml-org/llama.vim' to my ~/.vim/plugin/plug-list.vim
  • adding a llama-config.vim file (see below) in ~/.vim/plugin/ to set my preferences.
  • adding a copillot script (see below) in ~/.local/bin/ to start the engine with my preferred parameters.
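Condensed, the setup looks roughly like this; a sketch assuming Homebrew for the install and vim-plug already configured (adapt the package manager to your platform):

# install llama.cpp, which ships the llama-server binary
brew install llama.cpp
# declare the plugin (then run :PlugInstall inside vim)
echo "Plug 'ggml-org/llama.vim'" >> ~/.vim/plugin/plug-list.vim
# drop the two files below in place and make the script executable
mkdir -p ~/.local/bin
chmod +x ~/.local/bin/copillot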

llama-config.vim

" put before llama.vim loads
" let g:llama_config = { 'show_info': 0 }
highlight llama_hl_hint guifg=#f8732e ctermfg=209
highlight llama_hl_info guifg=#50fa7b ctermfg=119
let g:llama_config = {
    \ 'endpoint':         'http://127.0.0.1:8012/infill',
    \ 'api_key':          '',
    \ 'n_prefix':         512,
    \ 'n_suffix':         128,
    \ 'n_predict':        128,
    \ 't_max_prompt_ms':  500,
    \ 't_max_predict_ms': 500,
    \ 'show_info':        1,
    \ 'auto_fim':         v:true,
    \ 'max_line_suffix':  8,
    \ 'max_cache_keys':   250,
    \ 'ring_n_chunks':    16,
    \ 'ring_chunk_size':  64,
    \ 'ring_scope':       1024,
    \ 'ring_update_ms':   1000,
    \ }

copillot

#!/usr/bin/env sh

# supposedly better, but pretty slow:
# MODEL="Qwen/Qwen2.5-Coder-32B-Instruct-GGUF"
# also a bit slow:
# MODEL="ggml-org/Qwen2.5-Coder-14B-Q8_0-GGUF"
# pretty fast!
MODEL="Qwen/Qwen2.5-Coder-3B-Instruct-GGUF"
# really fast!
# MODEL="Qwen/Qwen2.5-Coder-1.5B-Instruct-GGUF"

PORT=8012
BATCH_SIZE=2048
GPU_LAYERS=99
CTX_SIZE=0 # 0 = use model max context
CACHE_REUSE=256

# start llama.cpp's server, pulling the model from Hugging Face (-hf) on first run
llama-server \
    -hf "$MODEL" \
    --port "$PORT" \
    -ngl "$GPU_LAYERS" \
    -fa \
    -ub "$BATCH_SIZE" \
    -b "$BATCH_SIZE" \
    --ctx-size "$CTX_SIZE" \
    --cache-reuse "$CACHE_REUSE"

(need to run chmod +x ~/.local/bin/copillot)

How does it work together?

For now, I manually run copillot in a terminal when I need/want to, and mostly forget about it. Then I simply edit any file with vim, and the plugin will use the shared port to get suggestions. When I type in insert mode, the model will generate one, and the plugin will use virtual text to display it; at this point, I can either:

  • keep typing, ignoring it.
  • press Shift+Tab to complete only the current line.
  • press Tab to insert the whole suggestion.
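If nothing shows up, a quick sanity check, independent of Vim, is to poke the server directly; llama-server exposes a /health endpoint on the same port:

# should answer {"status":"ok"} once the model is loaded and ready
curl -s http://127.0.0.1:8012/health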

As I selected a small variant of the model, I trade accuracy for speed: the model is not going to suggest very smart things, but it'll usually answer in much less than a second when I pause my typing, and since most of the code I type is not ground-breaking, it often sees where I'm going, and can save me a few lines of typing (and the typos that come with them), even if I might need to edit them (after all, I'm using vim, editing is what we are good at), and let's not fool ourselves, I'd have to edit them anyway.

If I type from click import

my buffer immediately looks like this

from click import |command, option, argument
from typing import List
@command()
@option('--name', default='World', help='Name to greet')
def greet(name: str) -> None:
    """Greet someone."""

While my cursor is still on the space after import, I can decide to accept this suggestion, which will give me the start of a quick hello world with click, neat! If I accept it, I'll get the rest of it as a follow-up suggestion.

But of course, that's a very simple demo; if I have more context, with multiple buffers, classes defined in them, etc, it can use them relatively smartly and infill my current line depending on what's being done elsewhere in the file. It's not very smart, I still need to type some code (or sometimes, a comment) to indicate where I'm going, but I'm quite impressed by how much of the day to day stuff it can churn out.

There is a rule though: when I get a completion, it should look like what I'm expecting; if not, I should be able to read and understand it (of course, one must understand the code they commit), and failing that, I should really look up the part of it I don't know, and see if it fits. The danger of “vibe coding” is that you get a lot of code you don't understand and can't debug, and that's a terrible place to get your project to. It's not really a new danger: copy/pasting code from somewhere and tinkering to make it work has been the practice of many coders for many years, and the cause of many regrets.

But sometimes, too, it teaches me a simpler way to do things than what I was about to do, and after checking that it really does work, I appreciate it just like I would if a coworker had shared it in a pairing session.

Let's do a bit more testing, shall we?

Can we put an image in there

A sculpture of a baby elephant, sitting on the street, in London

A poster plastered on the wall, it's black and white, showing two hands holding a representation of the earth, engulfed in flames; around them, a series of large lines, all pointing to the bottom center of the image, adds to the impression of urgency. At the bottom, the lettering "you can panic". Someone wrote on the continents, with a small marker, "climate change is not real"; the poster is slightly scratched, someone might have tried to remove it.

Picture of a building, made of bricks, in London

These images are hosted on Google Photos, and integrated using markdown. The process to get a link to a Google Photos image takes a few steps, but is not too hard:

  • first you have to share a picture, or a series of pictures, with a public link.
  • open that link, and select the (or a, if you shared multiple) picture from the gallery to view it.
  • right click it and copy the link to the image, or open the image in a new tab, and copy the link from the url bar there.
  • type ![]() in your blog post.
  • paste the link to the image inside the parentheses.
  • type an image description inside the brackets.
  • if you want the description to be visible when the mouse hovers over the image, you can add it again, after the url, inside quotes.
  • the end result will look like ![this is an image description](https://example.com/url/to/image.jpeg "this is the image title")
  • you are done!

Testing this nice piece of software.

Well, it's not vim, I can tell you that, but it's still comfortable to write and read, certainly a good place to dump one's thoughts without too much complication.

I used to have a blog somewhere, should I try to unearth those old posts? Not sure, a couple of posts were interesting, I think, but half are probably mostly outdated, anyway.

Anyway, one thing I'm not totally sure about with this software is the localisation situation: I don't see any way to set the language in the configuration, but it seems to display some parts of the UI in French (like dates), to please my browser's settings surely, while other elements seem to be in English.