Tweaking Chunk Storage

Panserbjoern

Member
Contributor
Architecture
Cervator edit: Moved to Core Projects. Might need a thread redo sometime to be more up to date, tons of changes since @Panserbjoern started with sparse chunks, now we have stacked chunks and so on too :)
Skaldarnar edit: Moved to incubator, added header

Name: Chunk Storage
Summary: Several ideas on how to improve storage with regard to memory usage. The initial attempt added »Sparse Chunks« to save some memory. Currently I am working on improved serialization of chunks and tera arrays.
Scope: Core
Current Goals: The current goal is to implement improved serialization for whole chunks including support for protobuf. This also touches caching of chunks at runtime and the storage format for entire worlds.
Phase: Implementation
Curator: Community
Related: Chunk Visualizer http://forum.movingblocks.net/threads/chunk-storage-sparse-chunks.665/page-2#post-6426

Hi all

For the past two weeks or so I have been working on sparse chunks. The current implementation works with all terrain generators and gives a very significant reduction in memory consumption.

Most chunks can be compressed by about 80 to 90 percent, and on my machine there is no significant impact on performance!

The implementation is based on four important classes:
  • TeraDenseArray8Bit --> TeraSparseArray8Bit
  • TeraDenseArray4Bit --> TeraSparseArray4Bit
When a chunk is initially generated it uses the dense arrays for performance reasons, so chunk generators run without performance loss. As soon as the state of the chunk is set to COMPLETE, the chunk switches to the sparse arrays and throws away as much redundant data as possible. This works for block, sunlight, light and liquid data. Especially light and liquid data can be reduced by up to 99% in most cases. At runtime the sparse arrays are extended as necessary when the world is changed.
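To make the deflation step concrete, here is a toy sketch of the idea. The class and method names below are simplified stand-ins, not the actual TeraDenseArray8Bit/TeraSparseArray8Bit implementations: a dense array stores one byte per block in a flat array, while the sparse variant collapses uniform horizontal planes down to a single fill value, which is where the big savings on light and liquid data come from.

```java
// Toy sketch of dense -> sparse deflation; not the real Terasology classes.
public class DeflationSketch {

    /** Dense: one byte per block, laid out as a flat array. */
    public static final class Dense {
        final byte[] data;
        final int sx, sy, sz;

        public Dense(int sx, int sy, int sz) {
            this.sx = sx;
            this.sy = sy;
            this.sz = sz;
            this.data = new byte[sx * sy * sz];
        }

        public byte get(int x, int y, int z) {
            return data[y * sx * sz + z * sx + x];
        }

        public void set(int x, int y, int z, byte v) {
            data[y * sx * sz + z * sx + x] = v;
        }

        public int sizeInBytes() {
            return data.length;
        }
    }

    /** Sparse: uniform y-planes collapse to a single fill value. */
    public static final class Sparse {
        final byte[] fill;     // per-plane fill value, used when planes[y] == null
        final byte[][] planes; // full plane data, only kept for non-uniform planes
        final int sx;

        Sparse(byte[] fill, byte[][] planes, int sx) {
            this.fill = fill;
            this.planes = planes;
            this.sx = sx;
        }

        public byte get(int x, int y, int z) {
            byte[] plane = planes[y];
            return plane == null ? fill[y] : plane[z * sx + x];
        }

        public int sizeInBytes() {
            int size = fill.length;
            for (byte[] plane : planes) {
                if (plane != null) {
                    size += plane.length;
                }
            }
            return size;
        }
    }

    /** Deflate a dense array by collapsing uniform y-planes. */
    public static Sparse deflate(Dense d) {
        byte[] fill = new byte[d.sy];
        byte[][] planes = new byte[d.sy][];
        for (int y = 0; y < d.sy; y++) {
            byte[] plane = new byte[d.sx * d.sz];
            byte first = d.get(0, y, 0);
            boolean uniform = true;
            for (int z = 0; z < d.sz; z++) {
                for (int x = 0; x < d.sx; x++) {
                    byte v = d.get(x, y, z);
                    plane[z * d.sx + x] = v;
                    uniform &= (v == first);
                }
            }
            if (uniform) {
                fill[y] = first;
            } else {
                planes[y] = plane;
            }
        }
        return new Sparse(fill, planes, d.sx);
    }
}
```

For a 16x256x16 chunk that is all air except a handful of placed blocks, this collapses 65,536 bytes to a few hundred, in line with the 80-99% figures above. Comparing `sizeInBytes()` of the two forms also gives a natural criterion for switching back to dense when a sparse array fills up with non-redundant data.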

There is still some tuning possible. Currently, chunks are compressed only once, immediately after their state is set to COMPLETE; after that, they never get recompressed. Also, if a sparse array is completely filled with non-redundant data it uses more memory than its dense version. Switching back to the dense version in such a case would be desirable.

Further, all the classes are designed to make it easy to add alternative implementations, and switching to a bigger datatype for block ids later would also be easy.

And of course every time a chunk is compressed it generates a nice log message to indicate by how much its data has been reduced. :)

Please take a look at it and give me some feedback.


I've created a pull request:
https://github.com/MovingBlocks/Terasology/pull/391


Panserbjoern
 

Esereja

Active Member
Contributor
How do you think this will interact with the move to vertical chunks?
I assume it only results in a space reduction.
 

Skaldarnar

Development Lead
Contributor
Art
World
SpecOps
Hey Panserbjoern
that sounds nice! I'll have a look at this tomorrow or so, just need to find the time... Would it be an option to reload/recompress the chunks when saving the worlds/leaving the game? Or have I missed something regarding the generation?
 

Cervator

Org Co-Founder & Project Lead
Contributor
Design
Logistics
SpecOps
Thanks Panserbjoern :)

I'll leave the closer code review up to Immortius although I have one big suggestion - could you try to write a benchmark test for these new classes akin to this? That might be a good way to get some statistics to compare them more exactly to each other, to the existing classes, and under various circumstances.

Vertical chunks are something we're getting very close to needing. In short, we currently only have one layer of 256-block-tall chunks in the world and want to graduate to n chunks high, with each chunk maybe being something different than 16x256x16. As part of that, some sort of summary data (to catch sparse chunks that are mostly or all air, dirt, etc.) could be useful, especially when it comes to rendering distant chunks. Or maybe compressed chunks like this, if you can easily extract summary data.

Edit:

Would it be an option to reload/recompress the chunks when saving the worlds/leaving the game? Or have I missed something regarding the generation?
I think we compress chunks when they are saved to disk; Kai Kratz worked on that some time ago. Not sure how these two forms of compression compare (we certainly still need to be able to update the arrays as chunks change).
 

Immortius

Lead Software Architect
Contributor
Architecture
GUI
I guess I would like to see some read/write metrics and unit tests, but otherwise it looks good.
 

Panserbjoern

Member
Contributor
Architecture
Skaldarnar, Cervator
To clarify: There are two kinds of compression regarding the chunks:
  1. When the world gets saved to disk the resulting binary representation of the chunks is compressed with a zip algorithm. That was already working and still is.
  2. The sparse arrays are a way to compress the chunks at runtime while they are still being used in the game. That's what I am currently working on.
To further clarify the distinction between dense chunks and sparse chunks:
  • Dense chunks are not compressed in any way. Currently, every block in a dense chunk requires 2.5 bytes. Therefore, a chunk of size 16x256x16 blocks = 65,536 blocks x 2.5 bytes/block = 163,840 bytes/chunk.
  • Sparse chunks are compressed at runtime. Chunks containing lots of air, water or rock can be significantly reduced in size. The current implementation is capable of reducing the size of such chunks by about 80% to 90%.
Currently, a chunk is always dense when it is generated for the first time. After the chunk has been completely generated, it is transformed into a sparse chunk to reduce the size it requires. When the player places or destroys blocks in the game, it might be possible that such a chunk could be compressed further because of the changes the player made to it. Currently that does not happen, because chunks are compressed only once, after they have been generated.
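As a sanity check on those numbers: the 2.5 bytes/block figure presumably comes from one 8-bit array (blocks) plus three 4-bit arrays (sunlight, light, liquid) - that breakdown is my reading of the four classes mentioned earlier, not something stated explicitly here.

```java
// Back-of-the-envelope check of the dense chunk figures quoted above.
// Assumes 8-bit block ids plus three 4-bit arrays (sunlight, light,
// liquid): 8 + 3*4 = 20 bits = 2.5 bytes per block.
public class ChunkSizeMath {
    public static int denseChunkBytes(int sx, int sy, int sz) {
        int blocks = sx * sy * sz;        // 16 * 256 * 16 = 65,536
        int bitsPerBlock = 8 + 3 * 4;     // 20 bits = 2.5 bytes
        return blocks * bitsPerBlock / 8; // 163,840 for a standard chunk
    }
}
```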

I don't think it would be good to recompress the sparse chunks when the world is saved; that would cause too much overhead and slow the whole process down. In my opinion it would be better to recompress them when they are loaded from disk as the game is resumed.

Cervator, Immortius
I will extend the existing ChunkCachePerformanceTest to test both dense and sparse chunks, and I will add a benchmark comparing dense to sparse chunks in both size and performance. I am interested in these metrics too! :)
 

Cervator

Org Co-Founder & Project Lead
Contributor
Design
Logistics
SpecOps
Pulled! Many thanks, and likewise to you on the holidays :)

Are you up for some related challenges in that area? :D

We should probably get you an incubator thread already on chunk storage. Some thoughts on future improvements:
  • I liked your mention on GitHub of making what TeraArrays are tied to Chunk more durable/flexible vs. overall world storage. Able to come up with a design to include chunk additions like that in a module?
  • Any potential for more array flavors? We've got Dense and Sparse, how about an even smaller summarized array/chunk for distant chunks that don't really need all their data to figure out how to render a distant representation of said chunks?
  • Vertically layered chunks :D This needs an official thread already
  • My ever-hawked Block Dynamics / Integrity / support for fancier liquid systems.
  • If 4-bit eats more performance than 8-bit, can we swap to all 8-bit and count on the compression to keep memory usage in check?
I need to make some incubator threads ... any items you like the look of especially? ;)
 

Panserbjoern

Member
Contributor
Architecture
Cool! :D Thanks for pulling! :omg:

I liked your mention on GitHub of making what TeraArrays are tied to Chunk more durable/flexible vs. overall world storage. Able to come up with a design to include chunk additions like that in a module?
Can you elaborate on that? What do you mean with "chunk additions" and "module"?

Any potential for more array flavors? We've got Dense and Sparse, how about an even smaller summarized array/chunk for distant chunks that don't really need all their data to figure out how to render a distant representation of said chunks?
We can certainly use tera arrays for that. I think it would require some extensions to the local chunk provider to allow for asynchronous calculation of chunk summaries, as well as an additional chunk store for chunk summaries. What do you think?

Vertically layered chunks :D This needs an official thread already
How desperately do you need vertically layered chunks? I would be very interested in that topic, though it will require some big changes to the whole chunk implementation and handling, and even to the existing algorithms for map generation and lighting/liquids. There is still work to do with the sparse arrays (see below) and I would like to finish that before starting to work on such a big issue.

My ever-hawked Block Dynamics / Integrity / support for fancier liquid systems.
What's that?

About 4-bit versus 8-bit: chunk deflation can usually reduce the chunk data so significantly that it doesn't matter whether we use 4-bit or 8-bit. You get better compression with 4-bit of course, but not by very much.

It's now easy to switch to shorts as block ids. That *might* break existing worlds though... Not sure yet.

Anyway, there is still some work to do with the sparse arrays.
  • Currently the chunks get deflated at the very moment, and in the very thread, where the state of the chunk is set to COMPLETE. That should change. I want to add a ChunkTask to LocalChunkProvider so deflation is executed asynchronously. (The current solution might cause some performance issues.)
  • The whole sparse chunks stuff should be configurable on a per world basis. A GUI reachable through the create world dialog would be very cool. It should include at least the following:
    • Select default tera array implementations used at chunk creation.
    • Enable/disable chunk deflation.
    • Select the default chunk deflator used to deflate chunks.
    • Enable/disable logging of chunk deflation.
    • (Would also allow to switch between 4-bit and 8-bit implementations)
  • Also I would like to do some experiments on a very special kind of tera array. They would be read-only and would throw an exception on write access. The chunk would handle these exceptions and switch to a writable, less compressed implementation as soon as someone tries to write into a read-only implementation. That would allow even greater compression of chunks! Although I have no idea how that would impact performance.
  • I have to implement 16-bit dense and sparse arrays for storing 16-bit block ids. Additionally, there should be a special 16-bit sparse array using 8 bits internally if there are no more than 256 different block ids in the chunk. I guess that will be the case in most chunks, so it should yield a significant memory reduction compared to a plain 16-bit representation. It would require exception-based switching of array implementations though.
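The read-only/exception-switching idea in the last two bullets can be sketched roughly like this (hypothetical names, not the real TeraArray API): the maximally compressed array rejects writes, and the chunk reacts to the exception by inflating it to a writable copy.

```java
// Sketch of exception-based copy-on-write switching between array
// implementations. Hypothetical API for illustration only.
public class CowSketch {

    public interface Array {
        int get(int i);
        Array set(int i, int v); // returns the array to keep using afterwards
    }

    /** Maximally compressed: a single fill value, rejects all writes. */
    public static final class ReadOnlyUniform implements Array {
        final int fill;
        final int length;

        public ReadOnlyUniform(int fill, int length) {
            this.fill = fill;
            this.length = length;
        }

        public int get(int i) {
            return fill;
        }

        public Array set(int i, int v) {
            throw new UnsupportedOperationException("read-only array");
        }
    }

    /** Plain writable backing store. */
    public static final class Writable implements Array {
        final int[] data;

        public Writable(int[] data) {
            this.data = data;
        }

        public int get(int i) {
            return data[i];
        }

        public Array set(int i, int v) {
            data[i] = v;
            return this;
        }
    }

    /** Chunk-side handler: on a rejected write, inflate to a writable copy. */
    public static Array setBlock(Array a, int i, int v) {
        try {
            return a.set(i, v);
        } catch (UnsupportedOperationException e) {
            // Only ReadOnlyUniform throws in this sketch, so the cast is safe.
            ReadOnlyUniform ro = (ReadOnlyUniform) a;
            int[] copy = new int[ro.length];
            java.util.Arrays.fill(copy, ro.fill);
            return new Writable(copy).set(i, v);
        }
    }
}
```

The same pattern would cover the 16-bit-stored-as-8-bit case: the compact array throws when a 257th distinct block id appears, and the handler swaps in a genuine 16-bit implementation.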
 

Panserbjoern

Member
Contributor
Architecture
Oh yeah, about the protobuf stuff: I am not sure whether it makes sense to implement that just now. It will be necessary to have protobuf implemented for every tera array variation, I think. And that's not finished yet...
 

dei

Java Priest
Contributor
Design
Art
cool stuff Panserbjoern! I began to implement these sparse arrays too but ran out of time. Will review your code with pleasure when I find some minutes of time...

Dumb question: I don't know the current detailed architecture, but couldn't the protobuf stuff be implemented using the general TeraArray interface? Then we could save and load without even knowing the implementation. Or would that be a problem performance-wise?
 

Panserbjoern

Member
Contributor
Architecture
Hi dei, thank you very much! Well, theoretically it would be possible to store the data of the arrays without knowing the exact details of the implementation. They would just be stored as dense arrays. But when you load a saved world, the loaded arrays would also be dense arrays, because we don't know which exact implementation they used to be. So it would be necessary to deflate the chunks again when they are loaded. And it would also mean that we have to save much more data than actually necessary.
 

Immortius

Lead Software Architect
Contributor
Architecture
GUI
Yeah, the protobuf format would ideally be something like:

Code:
message ChunkMessage {
    optional Vector3iData pos = 1;
    optional TeraArray blockData = 2;
    optional TeraArray liquidData = 3;
    optional TeraArray sunlightData = 15;
    optional TeraArray lightData = 16;

    extensions 5000 to max;
}
 
message TeraArray {
    enum Type { DENSE_8 = 1; DENSE_4 = 2; SPARSE_8 = 3; SPARSE_4 = 4; /* etc. */ }
    optional Type type = 1 [default = DENSE_8];
    optional bool compressed = 2;
    optional Vector3iData dimensions = 3;
    // Probably some type-specific values here, but we at least need the raw data
    repeated bytes data = 4;
 
    extensions 5000 to max;
}
So all the different types of TeraArray would be known and split by a type enum. Loading should be as direct as possible - for dense arrays, just plonking the byte data directly into the array; for sparse, it may be a set of arrays. The serializer will need to know about the different implementations, but the implementations won't need to know about the serializer beyond providing the necessary means of obtaining the raw data and creating new arrays from the raw data.
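A minimal sketch of that serializer-side arrangement (hypothetical names, not the actual Terasology serializer or protobuf bindings): arrays expose only their raw bytes, and a registry keyed by the type tag knows how to rebuild each implementation from them.

```java
// Serializer-side dispatch by type id: arrays provide raw data, the
// registry provides "new array from raw data". Illustration only.
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class SerializerSketch {

    public interface Array {
        byte[] rawData();
    }

    /** One concrete implementation, identified by type id 1 below. */
    public static final class Dense8 implements Array {
        final byte[] data;

        public Dense8(byte[] data) {
            this.data = data;
        }

        public byte[] rawData() {
            return data;
        }
    }

    // Registry: type id -> factory from raw bytes. The serializer knows
    // about implementations; implementations know nothing about it.
    static final Map<Integer, Function<byte[], Array>> FACTORIES = new HashMap<>();
    static {
        FACTORIES.put(1, Dense8::new);
    }

    public static Array deserialize(int typeId, byte[] raw) {
        Function<byte[], Array> factory = FACTORIES.get(typeId);
        if (factory == null) {
            throw new IllegalArgumentException("Unknown TeraArray type " + typeId);
        }
        return factory.apply(raw);
    }
}
```

The `type` enum in the ChunkMessage sketch above would supply the type id, so loading a dense array really is just handing the byte field straight to the matching factory.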
 

Immortius

Lead Software Architect
Contributor
Architecture
GUI
Basically what you mean is custom additional data at a per-block level. Something like each chunk having a mapping of data-name -> TeraArray for additional stuff that can be defined through modules/mods. And then the ability to have additional generation phases working similar to the lighting and liquid systems, to specify which ones are propagated to clients in multiplayer, and so forth.

I agree we should have it, but it is a measure of last resort. In the first place, if the data can be stored in the block definition that is preferable, as it means the data is stored once per block type at most. If the information is only appropriate for rare blocks (like player-placed devices) or temporary (block damage), then using entities to store it is a good second choice. If the data is independent of the block definition, and reasonably prevalent, then per-block data may be the best choice.

Temperature is a good example, as I imagine it could be handled similar to lighting, with some consideration for a base temperature determined by general location in the world.
 

Panserbjoern

Member
Contributor
Architecture
Cervator, Immortius

1. I have implemented some additions to sparse chunks. Chunk deflation is now executed asynchronously, and it's now fully configurable.

Here is the pull request: https://github.com/MovingBlocks/Terasology/pull/400

2. About block dynamics: I can implement a per-block-data registry which allows mods to request per-block data with different bit widths (4, 8, 16). It could even support specialized tera array implementations and deflation algorithms supplied by the mod. That would break existing worlds again; that's unavoidable though.

3. I could even change the existing block, sunlight, light and liquid data to work through the same mechanism. The data would just be requested by the engine itself. That's preferable, I think, because it would simplify and generalize the codebase. It would also simplify future extensions of the game and the development of mods!
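A rough sketch of what such a per-block-data registry might look like - an entirely hypothetical API, just to illustrate points 2 and 3: mods register named fields with a bit width, and the engine would request block/sunlight/light/liquid data through the same path.

```java
// Hypothetical per-block-data registry, for illustration only.
import java.util.LinkedHashMap;
import java.util.Map;

public class PerBlockRegistry {

    public enum Width { BITS_4, BITS_8, BITS_16 }

    private final Map<String, Width> fields = new LinkedHashMap<>();

    /** A mod (or the engine itself) requests a named per-block field. */
    public void register(String name, Width width) {
        if (fields.containsKey(name)) {
            throw new IllegalArgumentException("Already registered: " + name);
        }
        fields.put(name, width);
    }

    private static int bits(Width w) {
        switch (w) {
            case BITS_4:  return 4;
            case BITS_8:  return 8;
            default:      return 16;
        }
    }

    /** Dense bytes needed per chunk for all registered fields together. */
    public int denseBytesPerChunk(int blocksPerChunk) {
        int bitsPerBlock = 0;
        for (Width w : fields.values()) {
            bitsPerBlock += bits(w);
        }
        return blocksPerChunk * bitsPerBlock / 8;
    }
}
```

Registering the engine's four current fields (one 8-bit, three 4-bit) reproduces the 2.5 bytes/block dense figure discussed earlier, so the engine-as-a-client-of-the-registry idea costs nothing in the dense case.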

By the way, I am not sure whether it would be better to create a specialized format to store chunk data instead of using protobuf. I would like the tera arrays to know how to store their own data, because that would probably be necessary if mods are to be able to supply their own tera array implementations.

What do you think about that?
 

Cervator

Org Co-Founder & Project Lead
Contributor
Design
Logistics
SpecOps
Nicely done! And fast, too :)

Sounds good to me - and if deflation of 8-bit stuff can beat the memory saving of trying to use 4-bit, and even process faster, then that sounds like a win. I wouldn't worry about breaking saved worlds too much until we're in alpha/beta. Getting per-block stuff supported in mods/modules would be a huge win :ninja:

I'll wait to see if Immortius wants to chime in more on the technical review side. It would be good to be able to flag down Kai Kratz too, since he did some chunk magic a while back.

Edit: For Immortius - yeah, we certainly want to limit stuff that's per-block. That's why I started getting into all the madness about layering multiple purposes into one extra set, to maximize the use in every block-type scenario, at least for what I have in mind :)
 

Immortius

Lead Software Architect
Contributor
Architecture
GUI
3. I could even change the existing block, sunlight, light and liquid data to work through the same mechanism. The data would just be requested by the engine itself. That's preferable, I think, because it would simplify and generalize the codebase. It would also simplify future extensions of the game and the development of mods!
For the really core bits - blocks and maybe lighting - I would be concerned about the additional lookup step to access the data. I guess it could be set up so that the arrays themselves are obtained, which helps for any batch operations (many reads/writes) - although the loss of encapsulation would be a worry. Probably there is some way to provide views which work within the locks.

By the way, i am not sure whether it would be better to create a specialized format to store chunk data instead of using protobuf. I would like the tera arrays to know how to store their data, because that would probably be necessary if mods should be able to supply their own tera array implementations.

What do you think about that?
Hmm. Ultimately everything needs to be in protobuf, as that is how it is communicated across the network. But each tera array could implement methods to serialize/deserialize to a raw byte stream - this would then be combined with a type identifier to determine which tera array to use (handled by a TeraArray factory that obtains all available TeraArrays via Reflections, and then syncing ids over the network).

I would still suggest the tera arrays use protobuf internally though - it does a good job of handling change over time and basic compression, and it allows tools to be created in potentially other languages to work with the data.
 