How We Calculate Your Estimates: Methodology, Data Sources, and Accuracy
A transparent look at the formulas, climate data, and real-world validation behind every production estimate -- including the honest caveats about what we can and cannot model.
Why Transparency Matters
Every number you see on this site is a modeled estimate, not a measurement. We take your location, your system configuration, and a large climate dataset, then apply well-established physics to predict how much energy the system would produce over a typical year. This page explains exactly how we do it, where the data comes from, how we validated the math against real-world plant data, and -- most importantly -- where the limits of this approach lie.
The TL;DR
Typical accuracy is within ±10-15% of measured annual output for a well-installed system in a location with reliable climate data. Our wind formula is validated to within ~1% bias against real SCADA data. Our solar formula has a residual ~13% underestimation bias against one validation dataset, most of which is methodological rather than a formula bug.
Data Sources
We rely on two primary external data sources, both free and well-established in the renewable energy industry.
NASA POWER (climate input)
When you pick a location, we query the NASA POWER project for long-term monthly averages at that latitude and longitude. POWER aggregates over 40 years of satellite and reanalysis data and is widely used by researchers and PV planners. Specifically, we pull:
- ALLSKY_SFC_SW_DWN -- all-sky surface shortwave downward irradiance (kWh/m²/day), which is numerically equal to Peak Sun Hours (PSH)
- T2M / T2M_MAX / T2M_MIN -- air temperature at 2m (°C)
- WS2M -- wind speed at 2m above surface (m/s)
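As a rough illustration of what such a query could look like, the sketch below builds a request URL for these five parameters. The endpoint path, the `community` value, and the helper name `buildPowerUrl` are assumptions based on NASA's public POWER API documentation, not taken from this site's source code, so check them against the current POWER docs before relying on them.

```typescript
// Hypothetical helper: builds a request URL for the NASA POWER climatology
// endpoint. The endpoint path and query parameter names are assumptions
// based on the public POWER API docs, not this site's actual code.
const POWER_CLIMATOLOGY_URL =
  "https://power.larc.nasa.gov/api/temporal/climatology/point";

// The five parameters listed above.
const POWER_PARAMETERS: string[] = [
  "ALLSKY_SFC_SW_DWN",
  "T2M",
  "T2M_MAX",
  "T2M_MIN",
  "WS2M",
];

function buildPowerUrl(latitude: number, longitude: number): string {
  if (latitude < -90 || latitude > 90 || longitude < -180 || longitude > 180) {
    throw new RangeError("latitude/longitude out of range");
  }
  const query: Array<[string, string]> = [
    ["parameters", POWER_PARAMETERS.join(",")],
    ["community", "RE"], // renewable-energy community code (assumed)
    ["latitude", String(latitude)],
    ["longitude", String(longitude)],
    ["format", "JSON"],
  ];
  // encodeURIComponent percent-encodes the commas in the parameter list.
  const encoded = query
    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
    .join("&");
  return `${POWER_CLIMATOLOGY_URL}?${encoded}`;
}

console.log(buildPowerUrl(48.85, 2.35));
```

The helper only assembles the URL; fetching and parsing the response is left out because the response layout is not described on this page.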
What this means for you
POWER data is long-term climatology, not a weather forecast. It's great for annual averages but will not capture any given year's anomalies -- a cloudy summer, an unusually windy winter, or a drought. A real system will deviate from these averages year-to-year.
Kaggle SCADA Datasets (validation)
To check whether our math actually predicts reality, we compare our formulas against published SCADA (supervisory control and data acquisition) recordings from real operating plants:
- Wind Turbine SCADA Dataset -- 10-minute interval readings from a 3.6 MW onshore turbine, including measured wind speed, active power output, and the manufacturer's theoretical power curve.
- Solar Power Generation Data (Plant 1) -- 15-minute interval readings of DC and AC power from an operating Indian solar plant, plus on-site irradiance and module temperature sensors.
Solar Calculations
Our solar production model follows the methodology of NREL PVWatts, an industry-standard tool for modeling photovoltaic production. The core formula is:
“daily_energy = P_rated × PSH × tiltCorrection × azimuthCorrection × tempDerate × shadingFactor × (1 - systemLosses)”
Each term addresses a specific real-world loss mechanism:
Solar Formula Components
| Term | What it represents | Typical value |
|---|---|---|
| P_rated | System nameplate DC capacity | panelWattage × panelCount ÷ 1000 (kW) |
| PSH | Peak Sun Hours from NASA POWER | 3-7 kWh/m²/day depending on latitude |
| tiltCorrection | Penalty for non-optimal tilt angle | 1.0 when tilt equals latitude × 0.87; decreases quadratically as tilt deviates from that optimum |
| azimuthCorrection | Penalty for non-south-facing arrays | cos(azimuth_deviation), floored at 0.3 |
| tempDerate | Loss from panels getting hot | 0.88-1.05 (cold climates can exceed 1.0) |
| shadingFactor | Fraction of array not shaded | 0.85-1.0 typical |
| systemLosses | Soiling, wiring, inverter, mismatch | 14% default (PVWatts convention) |
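The table above can be read as a straight chain of multiplications. The sketch below is a minimal illustration of that chain, not a copy of the production code: the `SolarInputs` shape and function names are hypothetical, and the tilt, temperature, and shading factors are passed in as precomputed numbers because their exact formulas are not all spelled out here. Only the azimuth term is computed, using the cosine-with-floor rule from the table.

```typescript
// Hypothetical input shape; field names are illustrative only.
interface SolarInputs {
  panelWattage: number;        // W per panel (nameplate)
  panelCount: number;
  psh: number;                 // Peak Sun Hours, kWh/m²/day
  tiltCorrection: number;      // precomputed, at most 1.0
  azimuthDeviationDeg: number; // degrees away from the ideal azimuth
  tempDerate: number;          // precomputed, roughly 0.88-1.05
  shadingFactor: number;       // fraction of array not shaded
  systemLosses: number;        // fraction, e.g. 0.14
}

// cos(azimuth_deviation), floored at 0.3 as described in the table.
function azimuthCorrection(deviationDeg: number): number {
  const cosine = Math.cos((deviationDeg * Math.PI) / 180);
  return Math.max(0.3, cosine);
}

// Daily energy in kWh, multiplying every term of the core formula.
function dailyEnergyKwh(s: SolarInputs): number {
  const pRatedKw = (s.panelWattage * s.panelCount) / 1000;
  return (
    pRatedKw *
    s.psh *
    s.tiltCorrection *
    azimuthCorrection(s.azimuthDeviationDeg) *
    s.tempDerate *
    s.shadingFactor *
    (1 - s.systemLosses)
  );
}

// Example: 20 panels of 400 W, 5 PSH, ideal orientation, mild heat loss.
const exampleSystem: SolarInputs = {
  panelWattage: 400,
  panelCount: 20,
  psh: 5,
  tiltCorrection: 1.0,
  azimuthDeviationDeg: 0,
  tempDerate: 0.95,
  shadingFactor: 1.0,
  systemLosses: 0.14,
};
console.log(dailyEnergyKwh(exampleSystem).toFixed(2)); // about 32.68 kWh/day
```

Because every term is a multiplier, a 5% error in any single factor shifts the final estimate by the same 5%.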
Temperature Derating (The NOCT Model)
Solar panels lose efficiency when hot -- typically around 0.35% of output per °C of cell temperature above the 25°C rating temperature (a temperature coefficient of -0.35%/°C). We estimate cell temperature using the NOCT (Nominal Operating Cell Temperature) model:
“cellTemp = ambient + (NOCT - 20) × (irradiance / 0.8)”
With a typical NOCT of 45°C this simplifies to cellTemp = ambient + 25 × (irradiance / 0.8), where irradiance is in kW/m² and 0.8 kW/m² is the irradiance at which NOCT is defined. We deliberately scale the cell heating with actual irradiance rather than applying a flat offset -- a subtle point that makes a meaningful difference in cold climates where panels stay cool and can actually outperform their STC rating.
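To make the temperature step concrete, here is a minimal sketch of the NOCT relation combined with a linear temperature coefficient. It assumes irradiance in kW/m², a coefficient of -0.35%/°C, and a linear derate of the form 1 + coefficient × (cellTemp - 25); the function names are illustrative, and the production code may clamp or average these values differently.

```typescript
const NOCT_C = 45;               // typical nominal operating cell temperature
const STC_CELL_TEMP_C = 25;      // rating temperature
const TEMP_COEFF_PER_C = -0.0035; // -0.35% per °C (assumed typical value)

// cellTemp = ambient + (NOCT - 20) × (irradiance / 0.8), irradiance in kW/m².
function cellTemperature(ambientC: number, irradianceKwM2: number): number {
  return ambientC + (NOCT_C - 20) * (irradianceKwM2 / 0.8);
}

// Linear derate relative to the 25°C rating; values above 1.0 mean the
// cells are colder than the rating temperature.
function tempDerate(ambientC: number, irradianceKwM2: number): number {
  const cellC = cellTemperature(ambientC, irradianceKwM2);
  return 1 + TEMP_COEFF_PER_C * (cellC - STC_CELL_TEMP_C);
}

// Warm day, strong sun: cell at 50°C, derate about 0.9125.
console.log(cellTemperature(25, 0.8), tempDerate(25, 0.8));
// Cold day, weak sun: cell at 12.5°C, derate about 1.044.
console.log(cellTemperature(0, 0.4), tempDerate(0, 0.4));
```

The second case shows the cold-climate effect described above: with a flat +25°C offset the cell would be modeled at 25°C and the gain would disappear.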
Solar Validation Results
We tested four models against 1,634 matched daytime data points from the Kaggle Plant 1 dataset to isolate where our formula's errors come from:
Solar Model Comparison (vs measured DC output)
| Model | Bias | What it tests |
|---|---|---|
| Production formula (irr-scaled NOCT, 14% losses) | -12.9% | The model this site actually uses |
| Same + DC-only losses (5%) | -5.3% | Removes inverter loss to compare against DC data |
| Ground truth (measured module temp, 5% losses) | -12.4% | Isolates loss model from temperature model |
| Ideal (measured temp, zero losses) | -7.8% | Pure physics, no derating |
What this tells us
The residual -7.8% bias in the ideal model (pure physics, zero losses) means most of the gap isn't a formula error -- it's methodological. Plant 1 reports DC power against an inferred nameplate (the top 1% of observed output at high irradiance), and its AC data has a known reporting quirk. After accounting for this, our formula agrees with real plant behavior to within a few percentage points.
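For readers who want to reproduce this kind of comparison, the sketch below shows one common way to compute a relative bias and how a flat loss factor shifts it. The definition used here (total predicted energy versus total measured energy) is an assumption; the validation scripts may use a different aggregation, and both function names are hypothetical.

```typescript
// Relative bias: (sum of predictions - sum of measurements) / sum of measurements.
// Negative values mean the model underestimates.
function relativeBias(predicted: number[], measured: number[]): number {
  if (predicted.length !== measured.length || measured.length === 0) {
    throw new Error("inputs must be non-empty and the same length");
  }
  let sumPredicted = 0;
  let sumMeasured = 0;
  for (let i = 0; i < measured.length; i++) {
    sumPredicted += predicted[i];
    sumMeasured += measured[i];
  }
  return (sumPredicted - sumMeasured) / sumMeasured;
}

// Multiplying every prediction by (1 - loss) scales the predicted total by
// the same factor, so the new bias follows directly from the old one.
function biasAfterFlatLoss(bias: number, lossFraction: number): number {
  return (1 + bias) * (1 - lossFraction) - 1;
}

console.log(relativeBias([90, 180], [100, 200])); // about -0.1
console.log(biasAfterFlatLoss(-0.078, 0.05));     // about -0.124
```

The second call reproduces the relationship between two rows of the table: applying 5% losses to the ideal model's -7.8% gives (1 - 0.078) × 0.95 ≈ 0.876, or about -12.4%, which matches the measured-temperature row.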
Wind Calculations
Wind is trickier than solar because the power available in wind scales with the cube of wind speed -- a 10% overestimate of wind speed becomes a 33% overestimate of predicted power (1.1³ ≈ 1.33). To limit that error, we run the raw wind speed through a multi-step adjustment chain before computing power.
The Wind Speed Adjustment Chain
- Start with NASA POWER's WS2M (wind speed at 2 meters above ground, approximating open terrain).
- Apply terrain correction -- urban/suburban sites see much lower near-surface wind than open farmland, even when the gridded NASA value for both is identical. We reduce the NASA number by up to 65% in dense urban areas.
- Extrapolate to hub height using the power law V_hub = V_ref × (h_hub / h_ref)^α, where α is the wind shear exponent (0.14 for open, up to 0.35 for urban).
- Apply the turbine power curve: cubic law below rated speed, capped at nameplate above rated speed, zero outside the cut-in/cut-out range.
“P = 0.5 × ρ × A × v³ × Cp × η_generator × η_system”
Where ρ is air density (adjusted for elevation), A is the rotor swept area, v is wind speed at hub height, Cp ≈ 0.40 is the rotor power coefficient (well below the 0.593 Betz limit), and the generator and system efficiencies (0.9 × 0.85 ≈ 0.77) cover electrical conversion losses.
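The adjustment chain and power formula can be sketched end to end as follows. This is an illustrative outline under stated assumptions, not the production implementation: the `TurbineSpec` shape, the function names, and the demo turbine are hypothetical, the terrain factor is passed in as a plain multiplier (1.0 for open terrain, down to 0.35 for dense urban), and air density defaults to the sea-level standard of 1.225 kg/m³ rather than being adjusted for elevation.

```typescript
// Constants quoted in the text above.
const CP = 0.4;             // rotor power coefficient
const ETA_GENERATOR = 0.9;
const ETA_SYSTEM = 0.85;
const SEA_LEVEL_AIR_DENSITY = 1.225; // kg/m³, assumed default

// Hypothetical turbine description.
interface TurbineSpec {
  rotorDiameterM: number;
  ratedPowerW: number;
  cutInMs: number;
  cutOutMs: number;
}

// Steps 1-3: terrain correction, then power-law extrapolation
// V_hub = V_ref × (h_hub / h_ref)^α from the 2 m reference height.
function adjustToHubHeight(
  ws2m: number,
  terrainFactor: number,
  alpha: number,
  hubHeightM: number,
  refHeightM: number = 2
): number {
  return ws2m * terrainFactor * Math.pow(hubHeightM / refHeightM, alpha);
}

// Step 4: cubic law inside the operating range, capped at nameplate,
// zero outside cut-in/cut-out.
function turbinePowerW(
  vHub: number,
  turbine: TurbineSpec,
  airDensity: number = SEA_LEVEL_AIR_DENSITY
): number {
  if (vHub < turbine.cutInMs || vHub > turbine.cutOutMs) {
    return 0;
  }
  const sweptArea = Math.PI * Math.pow(turbine.rotorDiameterM / 2, 2);
  const cubic =
    0.5 * airDensity * sweptArea * Math.pow(vHub, 3) *
    CP * ETA_GENERATOR * ETA_SYSTEM;
  return Math.min(cubic, turbine.ratedPowerW);
}

// Demo: a made-up small turbine in open terrain with a 20 m hub.
const demoTurbine: TurbineSpec = {
  rotorDiameterM: 3,
  ratedPowerW: 1500,
  cutInMs: 3,
  cutOutMs: 25,
};
const demoHubSpeed = adjustToHubHeight(4, 1.0, 0.14, 20);
console.log(demoHubSpeed.toFixed(2), turbinePowerW(demoHubSpeed, demoTurbine).toFixed(0));
```

Capping with `Math.min` is equivalent to switching to nameplate output at rated speed, as long as rated speed is defined as the point where the cubic curve reaches nameplate power.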
Wind Validation Results
We compared our power curve against every 10-minute SCADA reading from a real 3.6 MW onshore turbine across the 3-25 m/s operating range, covering tens of thousands of data points:
Wind Formula Overall Accuracy
| Metric | Result | Interpretation |
|---|---|---|
| Overall bias | -0.7% | Essentially unbiased across the full dataset |
| RMSE (mid-range winds) | < 3% of manufacturer's theoretical curve | Tracks the real power curve very closely |
| Low-wind bins (3-4 m/s) | Over-predicts by ~200% relative | Small absolute error: ~60 kW on a 3600 kW turbine |
Why the low-wind 'over-prediction' isn't a problem
At 3-4 m/s the absolute error is tiny (under 2% of rated capacity) and turbines spend very little time in that speed range, so this bin contributes less than 2% to annual energy. Fixing it would require tuning Cp per turbine, which would actually reduce accuracy across the rest of the operating curve.
What We Cannot Model
Being transparent about our limits is as important as being transparent about our methods. The following factors genuinely affect real-world production but cannot be predicted from the inputs you provide:
- Soiling dynamics -- dust, pollen, and bird droppings accumulate gradually and are washed off unpredictably by rain. We assume a constant 2% soiling loss.
- Panel and turbine degradation over time -- real systems lose ~0.5%/year for solar and more variable amounts for wind. We estimate year-one production.
- Inverter clipping -- if your array's DC capacity exceeds your inverter's AC rating (a DC/AC ratio above 1), peak output is clipped at the inverter limit. Our model does not distinguish DC vs AC capacity.
- Microclimate -- a coastal cliff, a valley floor, or a rooftop heat island can deviate significantly from the nearest NASA grid point (~50 km resolution).
- Shading from neighbors, trees, chimneys -- we use a single flat shading factor; real shading is directional and time-of-day dependent.
- Net metering, feed-in tariffs, time-of-use rates -- financial calculations assume flat electricity rates unless you configure otherwise.
- Year-to-year weather variation -- a single bad year can produce 15-20% less than the climatological average.
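Some of these effects can be roughly approximated by hand on top of a year-one estimate. As one example, the sketch below projects later-year solar output from the ~0.5%/year degradation figure mentioned in the list. Compounding the rate annually is an assumption, the function name is hypothetical, and this is not something the site's estimates include.

```typescript
// Projects output in a given year of operation from a year-one estimate,
// assuming degradation compounds annually (an assumption; real degradation
// is not perfectly uniform).
function projectedAnnualKwh(
  yearOneKwh: number,
  year: number,
  annualDegradation: number = 0.005 // ~0.5%/year for solar
): number {
  if (!(year >= 1)) {
    throw new RangeError("year must be 1 or greater");
  }
  return yearOneKwh * Math.pow(1 - annualDegradation, year - 1);
}

// A system estimated at 10,000 kWh in year one:
console.log(projectedAnnualKwh(10000, 1).toFixed(0));  // 10000
console.log(projectedAnnualKwh(10000, 25).toFixed(0)); // roughly 8867
```

Under this assumption a system loses a little over 11% of its year-one output by year 25, which is smaller than the 15-20% swing a single bad weather year can cause.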
Intended Use
Planning tool, not a guarantee
This site is designed to help you compare technologies, understand trade-offs, and make informed early-stage decisions about renewable energy investments. It is not a substitute for a site-specific professional assessment before you commit capital. For any installation worth more than a few thousand dollars, get a certified installer to model your specific roof, land, or home with commercial tools like NREL SAM or PVsyst.
If you want to reproduce or audit any of our numbers, the full formulas live in src/lib/solar-calculations.ts and src/lib/wind-calculations.ts. The validation scripts that produced the bias numbers above are in scripts/validate/. We welcome corrections, better datasets, and improved physics -- renewable energy planning works better when the math is open.