Really really big worlds?

Matt Helliwell

New Member
Having come from Minecraft, I've just started looking at Terasology. In Minecraft I've programmatically generated a large world that takes up 3.5 TB on disk. (It's actually a 1:1 scale map of Great Britain, complete with houses, rivers and so on.)

Would it be possible to generate something similar using Terasology? I'd like to do it if only to get rid of the height restrictions I had to work around in Minecraft.

I've been browsing through the docs and there are a couple of issues I'm not clear on:
1. Will Terasology work with such a large world?
2. Will it take up a similar amount of space? If the space requirements double I'm a bit stuffed.
3. Can I get the generation to work like Minecraft where I can generate the whole world and save the result to a file? The generation is very expensive in terms of CPU so I want to generate the whole lot on a big fat server in advance.

Thanks for any pointers.

Matt
 

Cervator

Org Co-Founder & Project Lead
Contributor
Design
Logistics
SpecOps
Hey @Matt Helliwell - good questions! To my knowledge nobody has tried yet to generate such a vast world, so I'd be curious about the answers myself :)

Our worlds are stored a little differently, but I don't know if more efficiently or not. For us you'd end up with a ton of tiny little chunk files. From a quick sampling of a few saves I've got, I see them at 75-250 KB each. Do you have any way of estimating the per-chunk size of the Minecraft version of your world?

Doing some quick and really rough estimations:
  • GB is about 209,331 km²
  • Ignoring height for a moment, our chunks are 32x32, so you'd need just over 31 chunks to make a 1 km row, or about 977 per km²
  • About 204,424,804 chunks would cover the flat representation of GB - 200ish million
  • Say 150 KB for an average chunk (looks like it would usually be a little less) and you'd hit about 30 TB total without any sort of optimization (we've never had to worry about that before)
  • The actual number would probably be a fair bit higher if including height properly - but perhaps also a fair bit lower if we were to actually think about what we're storing and cut down on it (we've got some per-block data in there you could probably live without)
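The rough numbers above can be reproduced with a short Python sketch. It assumes 1 block = 1 metre (the 1:1 scale mentioned earlier), ignores height entirely, and uses decimal units (1 TB = 1e9 KB); the variable names are just for illustration.

```python
# Back-of-the-envelope storage estimate for a flat 1:1 map of Great Britain.
# Assumptions: 1 block = 1 metre, height ignored, 1 TB = 1e9 KB.
GB_AREA_KM2 = 209_331        # area figure quoted in the post
CHUNK_SIDE_BLOCKS = 32       # chunks are 32x32 horizontally
AVG_CHUNK_KB = 150           # guessed average chunk file size

chunks_per_km = 1000 / CHUNK_SIDE_BLOCKS          # 31.25 chunks per km row
chunks_per_km2 = chunks_per_km ** 2               # 976.5625 chunks per km^2
total_chunks = int(GB_AREA_KM2 * chunks_per_km2)  # about 204 million

total_tb = total_chunks * AVG_CHUNK_KB / 1e9      # average-case total
low_tb = total_chunks * 75 / 1e9                  # if every chunk were 75 KB
high_tb = total_chunks * 250 / 1e9                # if every chunk were 250 KB

print(f"{total_chunks:,} chunks, roughly {total_tb:.1f} TB "
      f"(range {low_tb:.1f} to {high_tb:.1f} TB)")
```

With the sampled 75-250 KB range the total lands somewhere between roughly 15 and 51 TB, so the average chunk size is the number most worth measuring first.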
To keep world generation running for a very long time, you'd probably just have to programmatically make more and more of the world count as relevant so that generation triggers (relevancy is usually based on being within an active player's view distance). That should be quite doable with a little effort and maybe some help from somebody familiar with our world gen on IRC/Slack/Discord.
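One way to picture "making more and more of the world count as relevant" is to walk the world in fixed-size batches of chunk coordinates and hand each batch to the generator in turn. The Python sketch below only does the bookkeeping; `region_batches` and its parameters are hypothetical names, and it does not call any actual Terasology API.

```python
from typing import Iterator, Tuple

# (min_cx, min_cz, max_cx, max_cz), all inclusive, in chunk coordinates
Region = Tuple[int, int, int, int]

def region_batches(width_chunks: int, depth_chunks: int, batch: int) -> Iterator[Region]:
    """Yield rectangular batches of chunk coordinates that tile the whole world."""
    for min_cz in range(0, depth_chunks, batch):
        for min_cx in range(0, width_chunks, batch):
            # Clamp the last batch in each row/column to the world edge.
            max_cx = min(min_cx + batch, width_chunks) - 1
            max_cz = min(min_cz + batch, depth_chunks) - 1
            yield (min_cx, min_cz, max_cx, max_cz)

# Toy example: a 100 x 70 chunk world processed in 32 x 32 chunk batches.
batches = list(region_batches(100, 70, 32))
for region in batches:
    # Here each region would be marked relevant and left until its chunks are saved.
    pass
```

Processing in batches like this also makes it easy to resume after an interruption, since only the index of the last completed batch has to be remembered.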

One potential challenge would be what sort of format you'd want to be reading from. Usually worlds are defined by the usual sorts of random noise. We do support a heightmap-based approach that reads an image file. Maybe if you just have a reeeaaaally big image? Probably you've got some sort of geography-specific file format though, got any details on it?

Short answer is: yes, it should all be possible. BUT chances are it will need some work here and there as the goal ventures into unexplored territory (in the code) :)
 

Matt Helliwell

New Member
Thanks. I'd be reading from a mixture of Ordnance Survey data and OpenStreetMap data, both of which are vector data rather than rasters. I did something similar with Minecraft but was able to pre-generate the map, so I didn't have to worry too much about performance. It took up 3.5 TB in the end.

I'll download the source code and start poking around.
 

Cervator

Org Co-Founder & Project Lead
Contributor
Design
Logistics
SpecOps
Great! Please do let us know what you find. I'm very curious about where we can optimize as I do want to see huge worlds sometime.

One of our Google Summer of Code projects was on a concept of Sectors, which is meant as another new organizational layer that MC lacks, to help support bigger worlds. In short they're dynamic divisions of the world so distant parts that don't really need to know about each other can be almost entirely disconnected. You'd still tie together stuff like global chat, but two groups of fighting/building players in different parts of the world wouldn't need to know about each other at all, and the two areas could even be run by entirely independent server processes.

We have two data stores, chunks and entities (pretty much everything that isn't a primitive block). You'd still end up with a huge data store with chunks from pre-generating a giant world, but long term we'd find a way via Sectors for the server to only pick up the set of chunks from the store it cares about plus a subset of entities relevant there. There has also been some related discussion about a new world generation pass that would only prepare an approximation of distant areas rather than fully generate every chunk - but I'm not sure how that would work in your case where you have a huge static world to start with rather than noise. Maybe with better support for parallel generation / processing the CPU hit would be worth not storing all the things up front.

So as usual with our project: plenty of potential, but it may take a bit of effort to get there. The first thing needed is usually a use case, so step 1: check! :)
 

Skaldarnar

Badges badges badges badges mushroom mushroom!
Contributor
Art
World
SpecOps
Hi @Matt Helliwell - an interesting topic you brought up there. Are you still around playing with the code, or have you already given up?

I'm quite interested in the aspect of keeping the world generator running without the need of a player entity to actually walk around. We were lacking a concept for that some time ago, but made some progress along with the Sectors @Cervator mentioned.

In case you can read the geo data via some web API it might also be interesting to see whether that can be hooked up to be done dynamically (within the bounds of how world generation currently works). Instead of computing random noise or reading chunk data from a height map one would query for the geographical data.
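For the dynamic approach, each chunk would need to be translated into the area of real-world data to fetch. A minimal Python sketch of that mapping is below, assuming 1 block = 1 metre and a projected coordinate system measured in metres; `chunk_bounds_metres` and the origin parameters are hypothetical, and the actual data query is left out because it depends on the service used.

```python
from typing import Tuple

CHUNK_SIDE_BLOCKS = 32  # horizontal chunk size mentioned earlier in the thread

def chunk_bounds_metres(cx: int, cz: int,
                        origin_easting: int = 0,
                        origin_northing: int = 0) -> Tuple[int, int, int, int]:
    """Return (min_e, min_n, max_e, max_n) in metres for chunk (cx, cz).

    Assumes 1 block = 1 metre and that chunk (0, 0) starts at the given origin.
    """
    min_e = origin_easting + cx * CHUNK_SIDE_BLOCKS
    min_n = origin_northing + cz * CHUNK_SIDE_BLOCKS
    return (min_e, min_n, min_e + CHUNK_SIDE_BLOCKS, min_n + CHUNK_SIDE_BLOCKS)

# The bounding box for a chunk could then be passed to whatever geo data source is used.
bbox = chunk_bounds_metres(3, -1, origin_easting=1000, origin_northing=2000)
```

Since a single chunk covers only 32 x 32 metres, querying per chunk would mean a very large number of small requests, so fetching and caching a larger tile per request would likely be needed in practice.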
 