Atmospheric Modeling

Overview

The atmospheric modeling system is the foundation of accurate balloon trajectory prediction. It provides comprehensive atmospheric state data including temperature, pressure, density, winds, and humidity from the surface to the edge of space. The system employs a sophisticated hierarchy of data sources, from real-time weather APIs to validated atmospheric models, ensuring global coverage with optimal accuracy.

Why Atmospheric Accuracy Matters: A 1°C error in temperature or 1 m/s error in wind speed can result in landing position errors of several kilometers for a typical high-altitude balloon flight. The atmospheric state directly affects buoyancy, drag, and trajectory.

Data Source Hierarchy

The simulator implements a layered fallback system so that atmospheric data is always available:

Priority Order:

  1. OpenWeather API - Global coverage, real-time data
  2. NOAA Weather API - US coverage, high accuracy
  3. GFS NOMADS - Global forecast model, GRIB2 format
  4. ECMWF Open Data - European model (planned)
  5. US Standard Atmosphere 1976 - Validated fallback
  6. NRLMSISE-00 - High altitude (>86 km) extension

Data Source Selection Algorithm

def select_atmospheric_data_source(lat, lon, alt, time):
    """
    Intelligent data source selection based on availability,
    location, and quality metrics
    """
    # Try primary real-time sources
    if openweather_api.is_available():
        data = openweather_api.fetch(lat, lon, time)
        if data.quality_score > 0.8:
            return data, "OpenWeather API"

    # US-specific high-quality data
    if is_within_us(lat, lon) and noaa_api.is_available():
        data = noaa_api.fetch(lat, lon, time)
        if data.quality_score > 0.9:
            return data, "NOAA Weather API"

    # Global forecast model
    if gfs_nomads.is_available():
        data = gfs_nomads.fetch(lat, lon, time)
        if data.quality_score > 0.7:
            return data, "GFS NOMADS"

    # Fallback to standard atmosphere
    return standard_atmosphere.calculate(lat, lon, alt), "US Standard Atmosphere"

US Standard Atmosphere 1976

The US Standard Atmosphere 1976 provides a mathematical model of Earth's atmosphere based on average conditions at mid-latitudes. It defines temperature, pressure, and density as functions of altitude through a series of atmospheric layers.

Temperature Profile by Layer

The atmosphere is divided into seven distinct layers, each with its own temperature gradient:

| Layer | Altitude Range | Name | Temperature Profile | Lapse Rate |
|-------|----------------|------|---------------------|------------|
| 1 | 0-11 km | Troposphere | $T(h) = 288.15 - 6.5h$ | -6.5 K/km |
| 2 | 11-20 km | Tropopause | $T(h) = 216.65$ | 0 K/km |
| 3 | 20-32 km | Lower Stratosphere | $T(h) = 216.65 + 1.0(h-20)$ | +1.0 K/km |
| 4 | 32-47 km | Upper Stratosphere | $T(h) = 228.65 + 2.8(h-32)$ | +2.8 K/km |
| 5 | 47-51 km | Stratopause | $T(h) = 270.65$ | 0 K/km |
| 6 | 51-71 km | Lower Mesosphere | $T(h) = 270.65 - 2.8(h-51)$ | -2.8 K/km |
| 7 | 71-86 km | Upper Mesosphere | $T(h) = 214.65 - 2.0(h-71)$ | -2.0 K/km |
where:
  • $T(h)$ = temperature as a function of altitude [K]
  • $h$ = altitude above sea level [km]
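
For example, at $h = 5$ km (inside the troposphere) the model gives $T = 288.15 - 6.5 \times 5 = 255.65$ K, while anywhere in the tropopause layer (11-20 km) the temperature is a constant 216.65 K.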

Pressure Calculation

Pressure is calculated using the barometric formula, which differs for isothermal and gradient layers:

For Isothermal Layers (constant temperature):

$$P(h) = P_0 \times \exp\left(-\frac{g_0M(h-h_0)}{RT}\right)$$
This exponential decay results from the hydrostatic equation $dP = -\rho g dh$ integrated under constant temperature conditions. The exponential form emerges because density is proportional to pressure at constant temperature.

For Gradient Layers (linear temperature variation):

$$P(h) = P_0 \times \left(\frac{T(h)}{T_0}\right)^{-\frac{g_0M}{RL}}$$
For layers with linear temperature variation $T = T_0 + L(h-h_0)$, the power law relationship emerges from integrating the hydrostatic equation. The exponent depends on the temperature gradient $L$.
where:
  • $P(h)$ = pressure at altitude $h$ [Pa]
  • $P_0$ = pressure at reference altitude [Pa]
  • $g_0 = 9.80665$ m/s² (standard gravity)
  • $M = 0.0289644$ kg/mol (molar mass of dry air)
  • $R = 8.3144598$ J/(mol·K) (universal gas constant)
  • $L$ = temperature lapse rate [K/m]
  • $h_0$ = reference altitude (layer base) [m]
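
As a consistency check, applying the gradient-layer formula across the whole troposphere ($T_0 = 288.15$ K, $L = -0.0065$ K/m) gives the pressure at the 11 km tropopause: $P = 101325 \times (216.65/288.15)^{5.256} \approx 22632$ Pa, the standard tabulated value.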

Density Calculation

Air density is calculated from the ideal gas law:

$$\rho(h) = \frac{PM}{RT}$$
where:
  • $\rho(h)$ = air density at altitude $h$ [kg/m³]
  • $P$ = pressure at altitude [Pa]
  • $T$ = temperature at altitude [K]
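
At sea level ($P = 101325$ Pa, $T = 288.15$ K) this gives $\rho = \frac{101325 \times 0.0289644}{8.3144598 \times 288.15} \approx 1.225$ kg/m³, the standard sea-level density.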

Implementation Example

import numpy as np


class USStandardAtmosphere1976:
    def __init__(self):
        # Layer definitions: (h_base[km], T_base[K], lapse_rate[K/km])
        self.layers = [
            (0,  288.15, -6.5),   # Troposphere
            (11, 216.65,  0.0),   # Tropopause
            (20, 216.65,  1.0),   # Lower Stratosphere
            (32, 228.65,  2.8),   # Upper Stratosphere
            (47, 270.65,  0.0),   # Stratopause
            (51, 270.65, -2.8),   # Lower Mesosphere
            (71, 214.65, -2.0),   # Upper Mesosphere
        ]

        # Physical constants
        self.g0 = 9.80665      # m/s²
        self.M = 0.0289644     # kg/mol
        self.R = 8.3144598     # J/(mol·K)
        self.P0 = 101325       # Pa (sea level)

    def get_temperature(self, altitude_km):
        """Calculate temperature at given altitude"""
        # Clamp to the 86 km model top; the NRLMSISE-00 extension handles
        # higher altitudes
        altitude_km = min(altitude_km, 86.0)

        for i, (h_base, T_base, lapse_rate) in enumerate(self.layers):
            if i == len(self.layers) - 1 or altitude_km < self.layers[i + 1][0]:
                return T_base + lapse_rate * (altitude_km - h_base)

    def get_pressure(self, altitude_m):
        """Calculate pressure at given altitude"""
        h_km = altitude_m / 1000.0

        # Find the layer containing the requested altitude
        layer_idx = 0
        for i in range(len(self.layers) - 1):
            if h_km >= self.layers[i + 1][0]:
                layer_idx = i + 1
            else:
                break

        # Integrate the hydrostatic relation from sea level up through each layer
        P = self.P0

        for i in range(layer_idx + 1):
            h_base, T_base, lapse_rate = self.layers[i]
            h_top = self.layers[i + 1][0] if i < len(self.layers) - 1 else h_km
            h_top = min(h_top, h_km)

            if abs(lapse_rate) < 1e-10:  # Isothermal layer
                P *= np.exp(-self.g0 * self.M * (h_top - h_base) * 1000 / (self.R * T_base))
            else:  # Gradient layer
                T_top = T_base + lapse_rate * (h_top - h_base)
                P *= (T_top / T_base) ** (-self.g0 * self.M / (self.R * lapse_rate * 0.001))

            if h_top >= h_km:
                break

        return P
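
A brief usage sketch follows; the burst altitude is an arbitrary example, and the density step applies the ideal-gas relation above rather than a method of the class.

atmosphere = USStandardAtmosphere1976()

altitude_m = 30000  # e.g. near a typical latex-balloon burst altitude
T = atmosphere.get_temperature(altitude_m / 1000.0)   # 226.65 K (lower stratosphere layer)
P = atmosphere.get_pressure(altitude_m)               # ~1.2 kPa
rho = P * atmosphere.M / (atmosphere.R * T)           # ideal gas law, ~0.018 kg/m³

print(f"T = {T:.2f} K, P = {P:.0f} Pa, rho = {rho:.4f} kg/m³")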

Real-Time Data Integration

OpenWeather API Integration

OpenWeather provides global atmospheric data with 3-hour forecasts and current conditions:

API Endpoints Used:

  • /data/2.5/weather - Current weather data
  • /data/2.5/forecast - 5-day forecast with 3-hour steps
  • /data/2.5/onecall - Comprehensive weather data

class OpenWeatherAPI:
    def __init__(self, api_key):
        self.api_key = api_key
        self.base_url = "https://api.openweathermap.org/data/2.5"

    def fetch_atmospheric_profile(self, lat, lon, time=None):
        """Fetch complete atmospheric profile from OpenWeather"""

        # Get current conditions
        current = self._fetch_current(lat, lon)

        # Get forecast data for vertical interpolation
        forecast = self._fetch_forecast(lat, lon)

        # Build atmospheric profile
        profile = AtmosphericProfile()

        # Surface conditions (note: 'grnd_level' in the API response is a
        # ground-level pressure in hPa, not an elevation, so the surface
        # level is anchored at 0 m here)
        profile.add_level(
            altitude=0,
            temperature=current['main']['temp'],
            pressure=current['main']['pressure'] * 100,  # hPa to Pa
            humidity=current['main']['humidity'],
            wind_speed=current['wind']['speed'],
            wind_direction=current['wind']['deg']
        )

        # Interpolate to standard pressure levels
        return self._interpolate_profile(profile, forecast)

NOAA Weather API Integration

NOAA provides high-resolution data for US locations through the National Weather Service API; a minimal gridpoint lookup sketch follows the product list below.

Data Products:

  • Radiosonde observations (twice daily)
  • Surface observations (hourly)
  • Model analysis fields
  • Gridded forecast data
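
For the gridpoint lookup that backs these products, a minimal sketch against the public api.weather.gov metadata endpoint is shown here; the function name and the User-Agent contact string are illustrative, and parsing the gridded response into a profile is omitted.

import requests

def fetch_nws_gridpoint_metadata(lat, lon):
    """Resolve a lat/lon to its NWS forecast office and grid cell.

    The /points endpoint returns the grid identifiers and URLs needed to
    request gridded forecast data for that location.
    """
    url = f"https://api.weather.gov/points/{lat:.4f},{lon:.4f}"
    # api.weather.gov asks clients to identify themselves via User-Agent
    headers = {"User-Agent": "balloon-simulator (contact: example@example.com)"}
    response = requests.get(url, headers=headers, timeout=10)
    response.raise_for_status()
    props = response.json()["properties"]
    return {
        "office": props["gridId"],
        "grid_x": props["gridX"],
        "grid_y": props["gridY"],
        "forecast_grid_url": props["forecastGridData"],
    }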

Data Quality Metrics

Each data source is evaluated for quality using multiple metrics:

$$Q_{score} = w_1 \cdot C_{temporal} + w_2 \cdot C_{spatial} + w_3 \cdot C_{vertical} + w_4 \cdot C_{physical}$$
where:
  • $C_{temporal}$ = temporal coverage (0-1)
  • $C_{spatial}$ = spatial resolution quality (0-1)
  • $C_{vertical}$ = vertical resolution quality (0-1)
  • $C_{physical}$ = physical consistency checks (0-1)
  • $w_i$ = weighting factors (sum to 1)
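
A direct translation of this score into code might look like the following sketch; the weights and component values here are illustrative, not the simulator's configured ones.

def compute_quality_score(c_temporal, c_spatial, c_vertical, c_physical,
                          weights=(0.3, 0.3, 0.2, 0.2)):
    """Weighted quality score; each component and the result lie in [0, 1]."""
    w1, w2, w3, w4 = weights
    assert abs(w1 + w2 + w3 + w4 - 1.0) < 1e-9, "weights must sum to 1"
    return w1 * c_temporal + w2 * c_spatial + w3 * c_vertical + w4 * c_physical

# Example: recent data with coarse horizontal resolution that passes all
# physical consistency checks
score = compute_quality_score(0.9, 0.6, 0.7, 1.0)   # 0.79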

GFS NOMADS Processing

The Global Forecast System (GFS) provides comprehensive atmospheric data in GRIB2 format through NOAA's NOMADS servers.

Pressure Levels

GFS data is provided at 26 standard pressure levels:

1000, 975, 950, 925, 900, 850, 800, 750, 700, 650, 600, 550, 500, 450, 400, 350, 300, 250, 200, 150, 100, 70, 50, 30, 20, 10 hPa

Variables Extracted

| Variable | Description | Units | Usage |
|----------|-------------|-------|-------|
| TMP | Temperature | K | Thermodynamic calculations |
| HGT | Geopotential height | m | Altitude reference |
| UGRD | U-component of wind | m/s | Eastward wind |
| VGRD | V-component of wind | m/s | Northward wind |
| RH | Relative humidity | % | Moisture content |
| ABSV | Absolute vorticity | s⁻¹ | Rotation measure |
| DZDT | Vertical velocity (geometric) | m/s | Vertical motion |

Pressure to Altitude Conversion

Converting pressure levels to geometric altitude uses the barometric formula:

$$h = \frac{T_0}{L} \times \left(1 - \left(\frac{P}{P_0}\right)^{\frac{RL}{g}}\right)$$
where:
  • $h$ = altitude [m]
  • $P$ = pressure at level [Pa]
  • $P_0 = 101325$ Pa (sea level pressure)
  • $T_0 = 288.15$ K (sea level temperature)
  • $L = 0.0065$ K/m (standard lapse rate)
  • $R = 287$ J/(kg·K) (specific gas constant for dry air)
  • $g = 9.80665$ m/s² (standard gravity)
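
A direct implementation of this conversion is sketched below using the constants listed above. Strictly, these constants apply only below the ~11 km tropopause; higher levels would be converted layer by layer (or taken directly from the GFS geopotential height field), so treat this as an illustrative sketch rather than the full routine.

T0 = 288.15    # K, sea-level temperature
P0 = 101325.0  # Pa, sea-level pressure
L = 0.0065     # K/m, standard lapse rate
R = 287.0      # J/(kg·K), specific gas constant for dry air
G = 9.80665    # m/s², standard gravity

def pressure_level_to_altitude(pressure_pa):
    """Convert a pressure level to altitude via the barometric formula."""
    return (T0 / L) * (1.0 - (pressure_pa / P0) ** (R * L / G))

# The 500 hPa level sits near 5,570 m in the standard atmosphere
print(round(pressure_level_to_altitude(50000.0)))   # ~5574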

GRIB2 Processing Implementation

import pygrib


class GFSNOMADSFetcher:
    def __init__(self):
        self.base_url = "https://nomads.ncep.noaa.gov/cgi-bin/filter_gfs_0p25.pl"
        self.pressure_levels = [1000, 975, 950, 925, 900, 850, 800, 750, 700,
                                650, 600, 550, 500, 450, 400, 350, 300, 250,
                                200, 150, 100, 70, 50, 30, 20, 10]

    def fetch_atmospheric_data(self, lat, lon, forecast_hour=0):
        """Fetch GFS data for specific location"""

        # Download GRIB2 file for the region
        grib_file = self._download_grib_subset(lat, lon, forecast_hour)

        # Parse GRIB2 data, keyed by pressure level [hPa]
        atmospheric_data = {}

        grbs = pygrib.open(grib_file)
        try:
            for grb in grbs:
                if grb.level in self.pressure_levels:
                    level_hPa = grb.level

                    if grb.name == 'Temperature':
                        atmospheric_data.setdefault(level_hPa, {})['temperature'] = grb.values
                    elif grb.name == 'U component of wind':
                        atmospheric_data.setdefault(level_hPa, {})['u_wind'] = grb.values
                    elif grb.name == 'V component of wind':
                        atmospheric_data.setdefault(level_hPa, {})['v_wind'] = grb.values
                    # ... process other variables
        finally:
            grbs.close()

        # Convert to altitude-based profile
        return self._convert_to_altitude_profile(atmospheric_data, lat, lon)

Derived Parameters

From the basic GFS variables, we calculate additional parameters:

Wind Speed and Direction:

$$|\vec{v}| = \sqrt{u^2 + v^2}$$
$$\theta_{wind} = \text{atan2}(u, v) \times \frac{180}{\pi}$$
This gives the bearing the wind blows toward, measured clockwise from north; the meteorological "from" direction is 180° opposite.

Air Density:

$$\rho = \frac{P}{R_d T}$$
with $R_d = 287$ J/(kg·K), the specific gas constant for dry air (equivalent to the $\rho = PM/RT$ form used earlier).

Potential Temperature:

$$\theta = T\left(\frac{P_0}{P}\right)^{\frac{R_d}{c_p}}$$
where $P_0 = 1000$ hPa is the reference pressure and $c_p \approx 1004$ J/(kg·K) is the specific heat of dry air at constant pressure.
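
A sketch combining these derived quantities for one GFS level is shown below (NumPy broadcasting applies the same expressions to whole grids); the constants are standard dry-air values, and the function name is illustrative.

import numpy as np

R_D = 287.0      # J/(kg·K), specific gas constant for dry air
C_P = 1004.0     # J/(kg·K), specific heat of dry air at constant pressure
P_REF = 1.0e5    # Pa, 1000 hPa reference pressure for potential temperature

def derive_level_parameters(u_wind, v_wind, temperature_k, pressure_pa):
    """Compute wind speed/direction, density, and potential temperature."""
    wind_speed = np.hypot(u_wind, v_wind)                          # m/s
    wind_to_dir = np.degrees(np.arctan2(u_wind, v_wind)) % 360.0   # deg, blowing-toward bearing
    density = pressure_pa / (R_D * temperature_k)                  # kg/m³
    theta = temperature_k * (P_REF / pressure_pa) ** (R_D / C_P)   # K
    return wind_speed, wind_to_dir, density, theta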

Data Fusion Algorithms

When multiple data sources are available, fusion algorithms combine them, weighting each source by its quality:

Weighted Average Fusion

For overlapping data sources, we use quality-weighted averaging:

$$\phi_{fused} = \frac{\sum_{i=1}^{n} w_i \phi_i}{\sum_{i=1}^{n} w_i}$$

where weights are determined by:

$$w_i = \frac{Q_i^2}{\sigma_i^2} \times \exp\left(-\frac{\Delta t_i}{\tau}\right) \times \exp\left(-\frac{\Delta r_i}{L}\right)$$
where:
  • $Q_i$ = quality score of source $i$
  • $\sigma_i$ = uncertainty estimate
  • $\Delta t_i$ = time offset from requested time
  • $\Delta r_i$ = spatial distance from requested location
  • $\tau$ = temporal decorrelation scale (~3 hours)
  • $L$ = spatial decorrelation scale (~50 km)
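
A minimal sketch of the weighted average itself, assuming the per-source weights $w_i$ have already been evaluated from the expression above:

import numpy as np

def fuse_profiles(values, weights):
    """Quality-weighted average of one variable across several sources.

    values  : (n_sources, n_levels) array, e.g. temperature profiles
    weights : (n_sources,) array of weights w_i
    """
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return (weights[:, None] * values).sum(axis=0) / weights.sum()

# Two temperature profiles, the first trusted twice as much as the second
fused = fuse_profiles([[288.0, 281.5], [289.0, 282.5]], [2.0, 1.0])
# -> [288.33, 281.83] (approximately)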

Vertical Interpolation

Atmospheric data often comes at discrete levels. We use cubic spline interpolation in log-pressure coordinates:

$$\ln(P) = a_0 + a_1 z + a_2 z^2 + a_3 z^3$$
Log-pressure coordinates are used because atmospheric pressure decreases approximately exponentially with altitude, making linear interpolation more accurate in log space.
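
A sketch of this interpolation with SciPy's CubicSpline, fitting $\ln(P)$ against altitude and evaluating pressure at arbitrary heights; the knot values below are approximate standard-atmosphere levels, and the choice of spline routine is an assumption.

import numpy as np
from scipy.interpolate import CubicSpline

def build_pressure_interpolator(altitudes_m, pressures_pa):
    """Fit a cubic spline to ln(P) versus altitude and return a P(z) callable."""
    log_p_spline = CubicSpline(altitudes_m, np.log(pressures_pa))
    return lambda z: np.exp(log_p_spline(z))

# Approximate US Standard Atmosphere levels used as spline knots
levels_m = np.array([0.0, 2000.0, 5000.0, 11000.0, 20000.0])
levels_pa = np.array([101325.0, 79500.0, 54020.0, 22632.0, 5474.9])

pressure_at = build_pressure_interpolator(levels_m, levels_pa)
print(float(pressure_at(8000.0)))   # ~3.5e4 Pa, within about 1% of the 8 km value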

Gap Filling

When data is missing at certain levels, physics-based interpolation is used:

import numpy as np

# Physical constants used for hydrostatic gap filling
G0 = 9.80665        # m/s², standard gravity
M_AIR = 0.0289644   # kg/mol, molar mass of dry air
R_UNIV = 8.3144598  # J/(mol·K), universal gas constant


def fill_atmospheric_gaps(profile):
    """Fill gaps using physical constraints"""

    # Temperature: use moist adiabatic lapse rate if unstable
    if is_unstable(profile):
        lapse_rate = calculate_moist_adiabatic_lapse_rate(
            profile.temperature,
            profile.pressure,
            profile.humidity
        )
    else:
        lapse_rate = -6.5  # K/km standard

    # Fill temperature gaps
    profile.temperature = interpolate_with_lapse_rate(
        profile.altitude,
        profile.temperature,
        lapse_rate
    )

    # Pressure: integrate the hydrostatic equation level by level
    for i in range(len(profile.altitude) - 1):
        if np.isnan(profile.pressure[i + 1]):
            dz = profile.altitude[i + 1] - profile.altitude[i]   # m
            T_avg = (profile.temperature[i] + profile.temperature[i + 1]) / 2
            profile.pressure[i + 1] = profile.pressure[i] * np.exp(
                -G0 * M_AIR * dz / (R_UNIV * T_avg)
            )

    return profile

Quality Control

Physical Consistency Checks

All atmospheric data undergoes rigorous quality control:

1. Hydrostatic Consistency:

Verify that pressure and temperature satisfy the hydrostatic equation:

$$\left|\frac{dP}{dz} + \rho g\right| < \epsilon$$

2. Temperature Gradient Limits:

Check that temperature gradients are physically reasonable:

$$-10 \text{ K/km} < \frac{dT}{dz} < 5 \text{ K/km}$$

3. Wind Shear Limits:

Ensure wind shear doesn't exceed physical limits:

$$\left|\frac{d\vec{v}}{dz}\right| < 0.1 \text{ s}^{-1}$$
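
As an illustration of the first check, a sketch of the `check_hydrostatic_balance` helper used by the quality-metric code below; the 5% tolerance and the finite-difference scheme are illustrative choices, and `profile` is assumed to expose NumPy arrays for altitude [m], pressure [Pa], and temperature [K].

import numpy as np

G0 = 9.80665   # m/s², standard gravity
R_D = 287.0    # J/(kg·K), specific gas constant for dry air

def check_hydrostatic_balance(profile, tolerance=0.05):
    """Count levels where dP/dz deviates from -rho*g by more than `tolerance`
    (as a relative error), using finite differences along the profile."""
    rho = profile.pressure / (R_D * profile.temperature)        # kg/m³
    dp_dz = np.gradient(profile.pressure, profile.altitude)     # Pa/m
    relative_error = np.abs(dp_dz + rho * G0) / (rho * G0)
    return int(np.sum(relative_error > tolerance))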

Statistical Quality Metrics

Statistical measures identify outliers and anomalies:

import numpy as np


def calculate_quality_metrics(profile):
    """Calculate comprehensive quality metrics"""

    metrics = {}

    # Vertical resolution quality
    dz = np.diff(profile.altitude)
    metrics['vertical_resolution'] = 1.0 / (1.0 + np.std(dz) / np.mean(dz))

    # Temperature smoothness (second differences as a curvature proxy,
    # assuming roughly uniform level spacing)
    d2T_dz2 = np.diff(np.diff(profile.temperature))
    metrics['temperature_smoothness'] = 1.0 / (1.0 + np.std(d2T_dz2))

    # Data completeness
    metrics['completeness'] = np.sum(~np.isnan(profile.temperature)) / len(profile.temperature)

    # Physical consistency
    violations = 0
    violations += check_hydrostatic_balance(profile)
    violations += check_temperature_limits(profile)
    violations += check_wind_shear(profile)
    metrics['physical_consistency'] = 1.0 / (1.0 + violations)

    # Overall quality score
    metrics['overall'] = np.mean([
        metrics['vertical_resolution'],
        metrics['temperature_smoothness'],
        metrics['completeness'],
        metrics['physical_consistency']
    ])

    return metrics

Implementation Details

Caching Strategy

To minimize API calls and improve performance, atmospheric data is cached intelligently:

import os
import pickle
import time


class AtmosphericCache:
    def __init__(self, cache_dir='data/atmospheric_cache'):
        self.cache_dir = cache_dir
        self.cache_duration = {
            'current': 600,      # 10 minutes
            'forecast': 3600,    # 1 hour
            'gfs': 21600,        # 6 hours
        }

    def get_cached_data(self, source, lat, lon, valid_time):
        """Retrieve cached data if still within the source's cache duration"""
        cache_key = self._generate_key(source, lat, lon, valid_time)
        cache_file = os.path.join(self.cache_dir, f"{cache_key}.pkl")

        if os.path.exists(cache_file):
            age = time.time() - os.path.getmtime(cache_file)
            if age < self.cache_duration.get(source, 3600):
                with open(cache_file, 'rb') as f:
                    return pickle.load(f)

        return None

Error Handling and Fallbacks

Robust error handling ensures the simulation continues even when data sources fail:

def get_atmospheric_profile_with_fallback(lat, lon, alt, time):
    """Get atmospheric profile with automatic fallback"""

    errors = []

    # Try each remote source in priority order
    for source in ['openweather', 'noaa', 'gfs']:
        try:
            profile = fetch_from_source(source, lat, lon, time)

            # Validate quality
            if profile.quality_score > 0.5:
                return profile
            else:
                errors.append(f"{source}: Low quality score {profile.quality_score}")

        except Exception as e:
            errors.append(f"{source}: {str(e)}")
            continue

    # Log all errors for debugging
    if errors:
        logger.warning(f"All atmospheric sources failed: {errors}")

    # Final fallback - the standard atmosphere always works
    return standard_atmosphere.get_profile(lat, lon, alt)

Performance Optimization

Several techniques optimize atmospheric data processing:

1. Parallel API Requests:

import asyncio


async def fetch_all_sources_parallel(lat, lon, time):
    tasks = [
        fetch_openweather_async(lat, lon, time),
        fetch_noaa_async(lat, lon, time),
        fetch_gfs_async(lat, lon, time)
    ]
    results = await asyncio.gather(*tasks, return_exceptions=True)
    return process_results(results)

2. Spatial Interpolation Grid:

Pre-compute interpolation weights for common trajectories

3. Vertical Level Optimization:

Only compute the pressure levels that fall within the expected balloon altitude range, as sketched below.
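
A sketch of that filter, reusing the `USStandardAtmosphere1976` class from the Implementation Example; the margin and function name are illustrative.

def select_relevant_levels(pressure_levels_hpa, expected_ceiling_m, margin_m=2000.0):
    """Drop pressure levels lying above the expected balloon ceiling plus a margin."""
    atmosphere = USStandardAtmosphere1976()
    ceiling_pressure_hpa = atmosphere.get_pressure(expected_ceiling_m + margin_m) / 100.0
    return [p for p in pressure_levels_hpa if p >= ceiling_pressure_hpa]

# A 5 km flight keeps only the levels from 1000 hPa down to about 450 hPa,
# while a 33 km ceiling keeps all 26 GFS levels.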