LAZ reading benchmarks

LAZ is my preferred point cloud format: it is lossless, compressed, and considerably smaller than uncompressed LAS. One thing that has always bugged me is that reading LAZ is quite slow, at least with the official LASzip implementation in C++. There’s also a Rust implementation, laz-rs, which supports parallel reading and can be used from Python through laspy. To see how the performance differs, I wrote short benchmarks in C++, Rust, and Python. The code can be found at the bottom of this article.

Benchmark results

I ran the benchmarks with a LAZ file of exactly 100 million points, approx. 650 MB in size. The uncompressed LAS version is 3.3 GB in size and takes about 6.8s (roughly 15 million points per second) to read with LASlib/LASzip on my test system. Here are the results for the LAZ file:

  • C++/LASzip: 37.9s, 2.6 million points per second
  • Rust/las-rs/laz-rs with parallel reading enabled: 9.9s, 10.1 million points per second
  • Python/laspy with lazrs backend: 4.3s, 23.3 million points per second
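The throughput figures follow directly from the point count and the measured times. As a quick sanity check, here is a minimal sketch that recomputes them and the speed relative to C++/LASzip. The helper name `throughput_mpts` and the dictionary labels are my own, not part of any library.

```python
# Recompute throughput and relative speed from the measured wall-clock times.
POINTS = 100_000_000  # points in the test file

# Measured read times in seconds, taken from the results above.
timings = {
    "LAS, C++/LASlib": 6.8,
    "LAZ, C++/LASzip": 37.9,
    "LAZ, Rust/las-rs": 9.9,
    "LAZ, Python/laspy": 4.3,
}


def throughput_mpts(points: int, seconds: float) -> float:
    """Return throughput in million points per second."""
    return points / seconds / 1e6


baseline = timings["LAZ, C++/LASzip"]
for name, seconds in timings.items():
    rate = throughput_mpts(POINTS, seconds)
    print(f"{name}: {rate:.1f} Mpts/s, {baseline / seconds:.1f}x vs. C++/LASzip")
```

Running the same arithmetic on your own timings makes it easy to compare machines or files of different sizes.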

It’s not surprising that C++ is the slowest, as LASzip reads point-by-point. With LASzip, reading the LAZ file takes about 5.6 times as long as reading a LAS file with the same contents (37.9s versus 6.8s). What surprised me is that Python is faster than Rust, despite both using the same backend. I discussed this with a developer who has much more experience with Rust than I do, and his explanation was that laspy operates directly on the uncompressed data, whereas las-rs converts it into a Rust data structure.

Just for fun, I also benchmarked the uncompressed LAS file: Python read it in 1.5s, Rust in 10.7s. Rust therefore needs about as long for LAS as for LAZ, so decompression is not what holds it back. This shows that laspy is very efficient at reading large data streams, whereas the bottleneck of las-rs appears to be the reformatting from raw data to a vector of structures.
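To make the difference between "operating on the raw data" and "converting to a vector of structures" more concrete, here is a minimal sketch in plain Python. It is not how laspy or las-rs work internally: the 12-byte record layout, the `Point` class and the function names are all hypothetical and chosen for illustration only.

```python
import struct
from dataclasses import dataclass

# Hypothetical simplified record: x, y, z as native 32-bit integers (12 bytes).
# Real LAS point records carry more fields; this layout is for illustration only.
RECORD = struct.Struct("=iii")
FIELDS = 3


@dataclass
class Point:
    x: float
    y: float
    z: float


def make_raw(n: int) -> bytes:
    """Build a packed buffer of n synthetic records."""
    return b"".join(RECORD.pack(i, 2 * i, 3 * i) for i in range(n))


def to_structs(raw: bytes, scale: float = 0.01) -> list:
    """Vector of structures: decode every record into its own Point object."""
    return [Point(x * scale, y * scale, z * scale)
            for x, y, z in RECORD.iter_unpack(raw)]


def column_view(raw: bytes, field: int) -> memoryview:
    """Column access: a strided view on the raw buffer, no per-point objects."""
    return memoryview(raw).cast("i")[field::FIELDS]


raw = make_raw(5)
points = to_structs(raw)      # one new object per point
z_raw = column_view(raw, 2)   # no copy, just a view on the buffer
print(points[4])
print(z_raw.tolist())
```

The first variant touches and converts every point up front, while the second only wraps the existing buffer and leaves the values unscaled until they are needed. That is the kind of extra work the explanation above attributes to las-rs.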

 

Conclusion

When working with LAZ files, LASzip with C++ is clearly the slowest solution. The parallel capabilities of laz-rs speed up reading with both Rust and Python, with Python the clear winner here. This means that, if the other processing steps can be done using functions that are not hindered by Python’s interpreted nature, Python can beat the other two. It should be possible in principle to use laz-rs or laspy with C++, or to use laz-rs directly in Rust instead of through las-rs.

There’s also one downside to the parallel read function of laz-rs: it requires loading the whole file at once, so it needs more RAM than reading in chunks or point by point.
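The memory trade-off can be illustrated with a generic sketch. It does not use laz-rs or laspy: it reads hypothetical fixed-size records from an in-memory stream, and the record layout and function names are assumptions made for the example.

```python
import io
import struct

RECORD = struct.Struct("<iii")  # hypothetical 12-byte point record


def iter_chunks(stream, points_per_chunk: int):
    """Yield blocks of at most points_per_chunk records from a binary stream.

    Assumes the stream length is a multiple of the record size.
    """
    chunk_bytes = RECORD.size * points_per_chunk
    while True:
        data = stream.read(chunk_bytes)
        if not data:
            break
        yield data


def count_points(stream, points_per_chunk: int):
    """Count records and track the largest buffer held at any one time."""
    total = 0
    peak_bytes = 0
    for data in iter_chunks(stream, points_per_chunk):
        total += len(data) // RECORD.size
        peak_bytes = max(peak_bytes, len(data))
    return total, peak_bytes


payload = b"".join(RECORD.pack(i, i, i) for i in range(10))
whole = count_points(io.BytesIO(payload), points_per_chunk=10)
chunked = count_points(io.BytesIO(payload), points_per_chunk=4)
print("whole file:", whole)
print("chunked:   ", chunked)
```

Both calls see the same number of points, but the chunked one never holds more than one chunk in memory. laspy offers a comparable pattern through `laspy.open()` and the reader's `chunk_iterator()`, which is not part of the benchmarks above.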

Code

C++/LASzip:

#include "lasreader.hpp"

#include <chrono>
#include <cstddef>
#include <cstdio>   // fprintf
#include <iostream>

using namespace std;

// Reads every point of the given LAS/LAZ file and prints timing statistics.
// Returns 0 on success and -1 if the file could not be opened.
int bench(const char* filename)
{
    auto start_time = chrono::high_resolution_clock::now();

    LASreadOpener lasreadopener;
    lasreadopener.set_file_name(filename);
    if (!lasreadopener.active())
    {
        fprintf(stderr, "ERROR: could not open lasreadopener\n");
        return -1;
    }
    LASreader* lasreader = lasreadopener.open();
    if (lasreader == nullptr)
    {
        fprintf(stderr, "ERROR: could not open lasreader\n");
        return -1;
    }

    // LASlib hands out one point per call, so simply count the calls.
    size_t counter = 0;
    while (lasreader->read_point())
    {
        counter++;
    }

    lasreader->close();
    delete lasreader;

    auto end_time = chrono::high_resolution_clock::now();
    // Elapsed time in seconds as a floating-point value.
    chrono::duration<double> elapsed = end_time - start_time;
    double seconds = elapsed.count();

    cout << "points read: " << counter << endl;
    cout << "time elapsed [s]: " << seconds << endl;
    cout << "points per second: " << counter / seconds << endl;

    return 0;
}

int main(int argc, char* argv[])
{
    if (argc < 2)
    {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }
    return bench(argv[1]) == 0 ? 0 : 1;
}
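
Rust/las-rs/laz-rs:

use las::{Point, Reader};
use std::time::Instant;

// Reads all points of the given LAS/LAZ file into a vector and prints timing statistics.
fn bench(filename: &str) {
    let start = Instant::now();
    let mut reader = Reader::from_path(filename).unwrap();
    // Pre-allocate so the vector does not have to grow while reading.
    let mut points: Vec<Point> = Vec::with_capacity(reader.header().number_of_points() as usize);
    // Fail loudly instead of silently discarding a read error.
    reader
        .read_all_points_into(&mut points)
        .expect("failed to read points");
    let elapsed = start.elapsed();

    let counter = points.len();
    println!("points read: {}", counter);
    println!("time elapsed [s]: {}", elapsed.as_secs_f64());
    println!("points per second: {}", counter as f64 / elapsed.as_secs_f64());
}

fn main() {
    let filename = std::env::args().nth(1).expect("usage: bench <file>");
    bench(&filename);
}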
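
Python/laspy:

import sys
import time

import laspy

def bench(filename: str) -> None:
    # perf_counter is a monotonic, high-resolution clock intended for timing.
    start = time.perf_counter()
    las = laspy.read(filename)  # reads (and decompresses) the whole file
    elapsed = time.perf_counter() - start

    # Count the points without computing scaled coordinates.
    count = len(las.points)
    print('points read:', count)
    print('time elapsed [s]:', elapsed)
    print('points per second:', count / elapsed)

if __name__ == '__main__':
    bench(sys.argv[1])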

 
