Despite the article name "Axelspace Employee Interview Series," this post does not feature anyone in particular. Instead we focus on the DevOps engineering of AxelGlobe: the Earth observation infrastructure service of Axelspace, offering satellite images and analytical data to our users.
Our satellites capture raw image data, which goes through several processes, such as the Data Processing Pipeline and the Analytics Pipeline, to deliver the final products: satellite images and satellite imagery analytical data.
Without these processing pipelines no satellite images or related products can be delivered to users, and therefore, it is no exaggeration to say that DevOps engineers play one of the most significant roles in developing a critical function within AxelGlobe.
If you are a software engineer with zero space-industry experience but are familiar with DevOps or cloud/infrastructure engineering, perhaps reading this article will help you realize that your software experience may be useful at a space start-up like Axelspace! 😎
Let's find out what our software engineers do to deliver AxelGlobe products from space! 🪐
What is the Data Processing Pipeline?
The Data Processing Pipeline is basically a series of steps that processes digital values acquired from satellites, such as Image Raw Data and Satellite Logs, and generates data sets (i.e., the Imagery Product) for users to use in their own applications.
The processing pipeline commonly applied to optical satellite imagery consists of three main steps:
(1) Optical satellite imagery is generated by observing solar radiation with the optical sensor of a satellite's payload. The observed value can be affected by multiple factors, such as the device sensitivity, particles on the telescope lens, and the atmosphere between the surface and the satellite. Radiometric Calibration corrects the errors these factors introduce into the solar reflectance by using reference radiation values observed at the ground surface.
(2) Geometric Calibration is the process of estimating where on Earth a captured satellite image is located. The coordinates of the captured image are calculated from the satellite's orientation relative to the Earth's surface. The system also compares GCPs (Ground Control Points, which have known coordinates) between the captured and reference images to estimate the location with higher accuracy.
(3) Spatial Resolution Enhancement and Sharpening generates 2.5m data from 5m resolution data. Several techniques can be used for this; for example, pansharpening combines 2.5m panchromatic band data with 5m multispectral band data to generate 2.5m multispectral band data.
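To make the pansharpening idea above concrete, here is a minimal sketch of one common technique, the Brovey transform, written with NumPy. This is an illustrative example only, not AxelGlobe's actual algorithm; the arrays and values are made up, and the multispectral bands are assumed to be already upsampled onto the panchromatic grid.

```python
import numpy as np

def brovey_pansharpen(pan, ms):
    """Sharpen multispectral bands with a higher-resolution panchromatic band.

    pan: 2-D array at full (e.g. 2.5m) resolution
    ms:  3-D array (bands, h, w) upsampled to the same grid
    """
    # Per-pixel intensity of the multispectral bands
    intensity = ms.sum(axis=0)
    # Ratio of pan detail to multispectral intensity; guard against
    # division by zero in completely dark pixels
    ratio = np.where(intensity > 0, pan / np.maximum(intensity, 1e-6), 0.0)
    # Rescale each band so its spatial detail follows the pan band
    return ms * ratio

# Toy 4x4 example: three flat multispectral bands and a flat pan band
pan = np.full((4, 4), 3.0)
ms = np.ones((3, 4, 4))          # band intensity sums to 3 everywhere
sharp = brovey_pansharpen(pan, ms)
```

Because the toy pan band equals the multispectral intensity everywhere, the ratio is 1 and the output matches the input; with real data, the ratio injects the pan band's finer spatial detail into each band.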
What is the Analytics Pipeline?
To promote the further usage of satellite imagery, AxelGlobe offers satellite imagery analyses, which are generated by the Analytics Pipeline. One example of analytical data we are currently working on is called Cloudless Basemap.
For Cloudless Basemap, the Analytics Pipeline detects cloud pixels in multiple satellite images of the same location taken at different times during the period of interest. Then, in the image composition step, it generates a single composite image with as many clouds removed as possible.
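The composition step above can be sketched in a few lines of NumPy: mask out the detected cloud pixels, then take a per-pixel median over the time series. This is a simplified illustration under assumed inputs (co-registered scenes and precomputed cloud masks), not the pipeline's actual implementation.

```python
import numpy as np

def cloudless_composite(stack, cloud_masks):
    """Compose a cloud-free image from a time series of co-registered scenes.

    stack:       array (t, h, w) of observations of the same location
    cloud_masks: boolean array (t, h, w), True where a pixel is cloudy
    """
    # Hide cloudy pixels, then take the per-pixel median of what remains;
    # pixels cloudy in every scene come out as NaN
    masked = np.ma.masked_array(stack, mask=cloud_masks)
    return np.ma.median(masked, axis=0).filled(np.nan)

# Three 2x2 scenes; scene 2 has one bright cloud pixel in the corner
stack = np.array([
    [[10., 10.], [10., 10.]],
    [[90., 10.], [10., 10.]],
    [[12., 12.], [12., 12.]],
])
masks = np.array([
    [[False, False], [False, False]],
    [[True,  False], [False, False]],
    [[False, False], [False, False]],
])
composite = cloudless_composite(stack, masks)
```

The cloudy corner pixel is ignored, so its composite value is the median of the two clear observations (11.0) rather than being skewed by the bright cloud.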
Why is Cloudless Basemap helpful?
Generally speaking, about 40% of optical satellite imagery is said to be covered by clouds, and this percentage is even higher in geographically or seasonally rainy areas. If the ground is hidden by clouds, users may not be able to see what they want on the satellite images, making it difficult to use the imagery to the fullest. Some users also specifically request images that are as cloud-free as possible, regardless of exactly when the image was taken (e.g., satellite images used for map services on electronic devices like smartphones).
Providing cloudless basemaps on a frequent basis enables users to track how a location changes over multiple points in time, such as the construction of new road infrastructure or the growth of crops. Cloudless Basemap reveals information that was previously unavailable or difficult to obtain due to cloud cover, and enables users to make decisions accordingly.
As space data is used across more industries than ever before, AxelGlobe aims to satisfy the world's demand for large-scale ground surface data (e.g., the Amazon rainforest, all of Japan) by continuously meeting quality standards with Cloudless Basemap.
What are some of the challenges DevOps engineers face in AxelGlobe?
One of the technical challenges DevOps engineers face is designing pipeline systems for large-scale data processing. They must build infrastructure that dynamically allocates the appropriate compute resources for data processing and algorithm execution based on the amount of input data. If the data can be processed in isolation, the pipeline must be designed to process the inputs in parallel to shorten the latency of delivering the data. The pipeline must also be scalable and flexible enough that, when the data processing method itself changes, the old and new processes can run in parallel so their results can be compared.
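The "process isolated inputs in parallel" idea can be sketched locally with Python's standard library. This is a toy illustration: `process_scene` is a hypothetical placeholder, and in a real pipeline each scene would more likely run as its own ECS/Fargate task or Lambda invocation rather than a local thread.

```python
from concurrent.futures import ThreadPoolExecutor

def process_scene(scene_id):
    # Placeholder for an independent per-scene processing step
    # (e.g., radiometric/geometric correction, tiling)
    return f"{scene_id}: done"

def run_pipeline(scene_ids, max_workers=4):
    # Scenes have no dependencies on one another, so they can be
    # processed in parallel; the worker count could be derived
    # from the amount of input data.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(process_scene, scene_ids))
```

The key design property is that each unit of work is self-contained, so scaling out is a matter of adding workers (or cloud tasks) rather than restructuring the pipeline.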
In order to achieve these goals, it is necessary not only to have scalable infrastructure systems, but also to reach a consensus on the design with the engineers developing the algorithm. The input and output of the algorithm must be defined at an appropriate level of abstraction, and the data acquisition and persistence must be flexible enough to change based on requirements.
Designing and implementing how data is persisted, and in what format, is also a challenging task. The amount, format, and retrieval method required vary depending on the use case, and persisting data for each use case incurs corresponding storage costs. Flexible retrieval methods must be developed while keeping storage costs low. Working with the team to discuss solutions to these challenges is essential to this role 👥
What kind of tools, libraries and languages do DevOps engineers use?
Most of the optical satellite imagery data processing systems operate on cloud services. Skill levels differ from person to person, but below is the rough tech stack our engineers use.
Data processing: Amazon ECS/Fargate, AWS Lambda, Amazon EC2
Data pipeline monitoring and visualization: Amazon CloudWatch, Amazon RDS Performance Insights
Data persistence: Amazon S3, Amazon RDS (PostgreSQL & PostGIS)
AuthN/AuthZ access control: AWS IAM
Data formats: GeoTIFF, JSON, GeoJSON
Infrastructure as Code: Terraform
Languages: Python, Bash
CI/CD: GitHub Actions
Additionally, although not a requirement, an interest in and/or understanding of the following technologies will come in handy when communicating with team members, as we primarily build operational systems for image processing 🙆
Computer vision (OpenCV)
Geospatial data libraries such as GDAL, rasterio
ML/DNN libraries such as PyTorch, TensorFlow, scikit-learn
Join Axelspace and help make AxelGlobe the next-generation infrastructure!
If any of the readers felt:
"The challenges AxelGlobe faces sound exciting"
"I might be interested in hearing more about the role because I have the basic cloud/infrastructure/DevOps skills"
"I'm interested in knowing more about the team"
"Perhaps this position can help me grow technically and increase my market value"
then go ahead and hit the button below to learn more! 👇