title: How I run my Ceph cluster, or how you can build a killer storage solution for almost free
---
pub_date: 2020-01-08
---
author: ungleich
---
twitter_handle: ungleich
---
_hidden: yes
---
_discoverable: no
---
abstract:

I wanted to store some data and this is what I came up with.
---
body:

Low cost, high tech data storage with Ceph

First of all, why would you run a Ceph cluster? It's complex, somewhat time consuming, and it is easier to lose data than with ZFS or ext4.

My reasons:

- it's very easy to expand and shrink
- you can manage all your data/disks from one host (which can be a security risk too)
- it's fun
- we have it in production and it scales well
- it unifies the physical disks of many machines into one pool

Step 1:

Find your local hardware dealer.

Second hand sites can be a good source, but good deals are rare. My tactic on ricardo.ch is: category "Server & Zubehör" (servers & accessories), filters: used, auction, max 1.5k CHF, sorted with the soonest-ending auctions first. Link: https://www.ricardo.ch/de/c/server-und-zubehoer-39328/?range_filters.price.max=1527&item_condition=used&offer_type=auction&sort=close_to_end

Nearby hazardous-material (e-waste) handling companies can be a goldmine. Big companies cannot just throw used hardware out as regular waste, because electronics contain small amounts of lead and other heavy metals. So big companies are sometimes happy to sell it cheaply as used equipment, and the e-waste companies are happy whenever they get more than the per-kilogram recycling price, which is very, very low.

Low quality (Core 2 Duo era) PCs also suffice, but you won't be able to run erasure coded pools, as they use a ton of processing power and RAM. Be careful with the RAM: if you run out of swap/RAM, your OSD process will be killed; I learnt that the hard way. Recovery also sometimes uses more RAM than usual, so keep some RAM free for safety.
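
On boxes like that it can help to pin down how much memory each OSD aims to use. A minimal sketch, assuming a reasonably recent Ceph with the centralized config database and BlueStore OSDs; the 2 GiB figure is only an example, not a recommendation:

```
# cap the memory each OSD tries to use for its caches (value in bytes)
ceph config set osd osd_memory_target 2147483648

# check what is currently configured
ceph config dump | grep osd_memory_target
```

Whatever you set, leave some headroom on top of it, since recovery can still push usage above the target.
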
Put 10G NICs on your shopping list; for performance they are absolutely crucial. I started without them, and it is certainly doable, but it won't perform well. A little hack is to pick up gigabit NICs (some people give them away for free) and put them in an LACP bond. Note: LACP doesn't make a single connection faster; the benefit only shows up with parallel connections.
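
As a rough sketch, such a bond in Debian-style /etc/network/interfaces syntax could look like the following; the interface names and address are placeholders, Alpine's ifupdown-ng may spell the options slightly differently, and the switch ports must be configured for 802.3ad as well:

```
auto bond0
iface bond0 inet static
    address 10.0.0.11
    netmask 255.255.255.0
    bond-slaves eth0 eth1
    bond-mode 802.3ad
    bond-miimon 100
    bond-xmit-hash-policy layer3+4
```

The layer3+4 hash policy spreads different TCP connections across the links, which suits Ceph's many-OSD traffic; a single stream is still limited to one link's speed.
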
If you don't care, or you have disks of equal size and speed, no worries: Ceph will happily consume everything you feed it (except SMR disks*, or use those strictly for frozen data only). One hack is to snag some old/low capacity disks for free. If you do everything right you can surpass SSD speeds with crappy spinning rust. Worried about disks dying? Just run higher redundancy levels (keep 2 extra copies of your data).

My personal approach is to keep coldish data 2x, hot data like VMs 3x, and one extra copy of both on a non-Ceph filesystem.
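
In Ceph that translates to per-pool replica counts. A minimal sketch; the pool names and placement group counts are only examples:

```
# hot data (VM images): 3 copies, keep serving writes with 2 left
ceph osd pool create vms 128 128 replicated
ceph osd pool set vms size 3
ceph osd pool set vms min_size 2

# coldish data: 2 copies
ceph osd pool create cold 64 64 replicated
ceph osd pool set cold size 2
```
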
You can also group disks by performance/size. Ideally the disks should be uniform within a Ceph device class, and equally distributed between hosts.
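
Device classes are what CRUSH rules can target. A sketch with a placeholder OSD id and pool name; Ceph normally auto-detects hdd/ssd, so reclassifying is only needed when it guesses wrong:

```
# re-tag an OSD (drop the auto-detected class first)
ceph osd crush rm-device-class osd.3
ceph osd crush set-device-class ssd osd.3

# a replicated rule that keeps data on ssd-class OSDs, one copy per host
ceph osd crush rule create-replicated fast-ssd default host ssd

# point a pool at the new rule
ceph osd pool set vms crush_rule fast-ssd
```
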
Avoid hardware RAID; use cards that give the OS full control over the disks. If you must use a HW RAID controller, single-disk RAID0 volumes are the way.

Install:
You can check out my ugly install script that is meant to bootstrap a cluster on a VM.

Tested on an Alpine VM with an attached /dev/sdb data disk (don't use legacy IP (IPv4)):

```
apk add bash
wget http://llnu.ml/data/ceph-setup
bash ./ceph-setup $ip_address_of_the_machine $subnet_that_you_will_plan_to_use
```

Operation:
I have never prepared a disk manually yet, which I should definitely revisit, because Nico wrote amazing helper scripts which can be found in our repo: https://code.ungleich.ch/ungleich-public/ungleich-tools.git
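
For completeness, the plain upstream way to prepare a disk by hand (this is not what the ungleich-tools scripts do, just the generic ceph-volume route; /dev/sdb is a placeholder and the command wipes it):

```
# turn a raw disk into a BlueStore OSD (destroys its contents)
ceph-volume lvm create --data /dev/sdb

# see what ceph-volume has set up on this host
ceph-volume lvm list
```
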
Some scripts still need minor modifications because Alpine doesn't ship Ceph init scripts yet. For the time being I manage the processes by hand.

Alpine's vanilla kernel doesn't have RBD support compiled in at the moment, but for any kernel that lacks the module, just use rbd-nbd to map block devices from your cluster.
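
A minimal sketch of that workflow; the pool and image names are placeholders, and you may need the generic nbd kernel module first:

```
# load the nbd module if it is not built in
modprobe nbd

# create a 10 GiB image (size is in megabytes by default) and attach it locally
rbd create vms/test --size 10240
rbd-nbd map vms/test    # prints the device it attached, e.g. /dev/nbd0

mkfs.ext4 /dev/nbd0
mount /dev/nbd0 /mnt

# detach again later
umount /mnt
rbd-nbd unmap /dev/nbd0
```
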
* https://blog.widodh.nl/2017/02/do-not-use-smr-disks-with-ceph/

Some useful commands:
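
A starter set, not an exhaustive list; these are all standard, read-only Ceph status commands:

```
# overall health and a one-page status summary
ceph health detail
ceph -s

# capacity usage per pool and per OSD
ceph df
ceph osd df

# where OSDs sit in the CRUSH tree and whether they are up/in
ceph osd tree

# pools and their settings (size, pg_num, crush rule, ...)
ceph osd pool ls detail

# follow the cluster log live
ceph -w
```
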