Dolt — It’s Git for Data
Database like a repo
Dolt is a SQL database that you can fork, clone, branch, merge, push and pull just like a Git repository. It’s easy to use, so let’s see how it works!
First of all, let’s install it with Homebrew: brew install dolt
To check if the installation was successful, run: dolt
After that, we can set our global configuration with these two commands:
dolt config --global --add user.email YOU@DOMAIN.COM
dolt config --global --add user.name "YOUR NAME"
Now we can create and initialize our first Dolt project. Create a new folder called state-pops and, inside it, run dolt init to set up a new Dolt repo.
We can create a new table called state_populations and put the data inside with the following commands:
dolt sql -q "create table state_populations (state varchar(14), population int, primary key (state))"
dolt sql -q "insert into state_populations (state, population) values
('Delaware', 59096),
('Maryland', 319728),
('Tennessee', 35691),
('Virginia', 691937),
('Connecticut', 237946),
('Massachusetts', 378787),
('South Carolina', 249073),
('New Hampshire', 141885),
('Vermont', 85425),
('Georgia', 82548),
('Pennsylvania', 434373),
('Kentucky', 73677),
('New York', 340120),
('New Jersey', 184139),
('North Carolina', 393751),
('Maine', 96540),
('Rhode Island', 68825)"
If everything went well we will see this: Query OK, 17 rows affected
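Typing a long list of values by hand makes it easy to misplace a quote or a comma. As a minimal sketch, here is one way to generate the same kind of insert statement from a Python dictionary. The helper names build_insert and run_in_dolt are made up for this example and are not part of Dolt; only three of the 17 rows are shown to keep it short.

```python
import shutil
import subprocess

# A few rows from the list above; the full list has 17 states.
ROWS = {
    "Delaware": 59096,
    "Maryland": 319728,
    "Tennessee": 35691,
}


def build_insert(rows: dict) -> str:
    """Return one multi-row INSERT for the state_populations table."""
    values = ",\n".join(
        # Double any single quote so a name like O'Brien cannot break the SQL.
        "('{}', {})".format(state.replace("'", "''"), int(population))
        for state, population in rows.items()
    )
    return "insert into state_populations (state, population) values\n" + values


def run_in_dolt(sql: str) -> None:
    """Hypothetical helper: run the statement with `dolt sql -q`.

    Call it from inside the repo folder, after `dolt init`.
    """
    if shutil.which("dolt") is None:
        raise RuntimeError("dolt is not on PATH")
    subprocess.run(["dolt", "sql", "-q", sql], check=True)


print(build_insert(ROWS))
```

Calling run_in_dolt(build_insert(ROWS)) from inside the state-pops folder should then have the same effect as typing the command by hand.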
Running dolt sql on its own opens an interactive SQL shell inside the repo, where we can run any select we want.
Now we can make our first commit, the same way we would in a Git repo:
dolt add .
dolt commit -m "initial data"
Suppose we want to change the data, for example by setting the population to 0 wherever the state name starts with “New”.
We can perform this update by opening the SQL shell again with the command dolt sql
and running update state_populations set population = 0 where state like 'New%';
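If you are not sure which rows a pattern like 'New%' will touch, it can help to try the statement on a throwaway copy first. The sketch below uses Python's built-in sqlite3 module as a stand-in. It is SQLite, not Dolt, so it only illustrates how the like pattern behaves on a handful of rows from our table.

```python
import sqlite3

# In-memory SQLite database: a stand-in for experimenting, NOT a Dolt repo.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table state_populations "
    "(state varchar(14), population int, primary key (state))"
)
conn.executemany(
    "insert into state_populations (state, population) values (?, ?)",
    [
        ("New Hampshire", 141885),
        ("New York", 340120),
        ("New Jersey", 184139),
        ("North Carolina", 393751),
        ("Maine", 96540),
    ],
)

# The same update as in the article.
cursor = conn.execute(
    "update state_populations set population = 0 where state like 'New%'"
)
changed = cursor.rowcount  # number of rows the update modified

zeroed = [
    row[0]
    for row in conn.execute(
        "select state from state_populations where population = 0 order by state"
    )
]
print(changed, zeroed)
```

Only the three states whose names begin with “New” are set to 0. North Carolina starts with “No”, so it is left alone.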
At this point, we can see the differences with the previous data using the command dolt diff
Then we can make another commit as before. To see the history of our changes, we use the command dolt log
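The review, commit and history steps always come in the same order, so they are easy to script. Here is a small sketch that runs them in sequence with Python's subprocess module. The function name review_and_commit and its run parameter are my own invention for illustration; the dolt commands themselves are the ones used in this article.

```python
import subprocess
from typing import Callable, List


def review_and_commit(
    message: str,
    run: Callable[..., object] = subprocess.run,
) -> List[List[str]]:
    """Show the pending diff, commit everything, then print the history."""
    steps = [
        ["dolt", "diff"],                   # what changed since the last commit
        ["dolt", "add", "."],               # stage every changed table
        ["dolt", "commit", "-m", message],  # record the change
        ["dolt", "log"],                    # list the commits so far
    ]
    for step in steps:
        # check=True stops the sequence at the first failing command.
        run(step, check=True)
    return steps
```

From inside the state-pops folder, review_and_commit("zero out the New states") would run the four commands one after another. The run parameter exists only so the function can be tried without Dolt installed, by passing in a fake runner.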
We are now ready to push our first local repository to the remote.
To do that, we need to log in to dolthub.com with dolt login
This opens the DoltHub sign-in page in the browser.
If we already have an account we can sign in; otherwise we can register a new one. For convenience, I signed in with my GitHub account.
One very important thing: after registering a new user or signing in with an existing one, you need to run dolt login again from the terminal, because Dolt tries to bind a key to our device. After doing that, we should get a confirmation message in the terminal.
After that, we can create a new repository on DoltHub and follow the instructions shown there to push our local repository.
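The exact instructions depend on your account and repository name, so treat the following as a sketch under assumptions rather than a copy of what DoltHub displays. It assumes the remote is called origin and the branch is main (older Dolt versions used master), and "your-user" is a placeholder for your own DoltHub username. The helper push_commands is hypothetical and simply prints the two commands to run.

```python
from typing import List


def push_commands(owner: str, repo: str, branch: str = "main") -> List[str]:
    """Return the two commands that link a local repo to DoltHub and push it.

    Assumptions: the remote is named "origin" and the branch defaults to "main".
    """
    if not owner or not repo:
        raise ValueError("owner and repo are both required")
    return [
        f"dolt remote add origin {owner}/{repo}",
        f"dolt push origin {branch}",
    ]


# "your-user" is a placeholder for a DoltHub username.
for command in push_commands("your-user", "state-pops"):
    print(command)
```

Run the printed commands from inside the state-pops folder, after dolt login has succeeded.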
And it’s done! We have our data on the remote repository!
We can interact with our data with the SQL console
Soon we will see how to perform all the normal operations that are done with repositories, such as creating new branches, merging and even importing data via CSV and working with other existing public datasets.
Yaaay! 🎉
I hope you learned something new. Thank you for your time!
Leave a comment or ask a question if you like, and follow me if you enjoyed this!
Cheers!