Your ERP Already Has Most of the Data You Need for CSRD

David Osei, July 28, 2025

There is a standard failure mode in CSRD data collection projects. A consultant sends a spreadsheet template with column headers for each GHG Protocol category. Your finance team spends two weeks manually exporting GL data from the ERP, normalizing vendor codes, and pasting it into the template. The consultant maps expense codes to categories, applies emission factors in a separate worksheet tab, and produces a draft Scope 1-3 inventory. Three months later, the disclosure document lands with a methodology appendix noting "spend-based estimates used due to limited data availability" — for categories where precise activity data was sitting in the ERP the entire time.

Most ESG platforms in the market replicate this workflow at the software layer. They accept CSV uploads. They provide mapping templates. They host the spreadsheet in a web interface. The fundamental problem — disconnection between the ERP as the source of truth and the ESG disclosure as the output — remains unaddressed. The connector architecture is what changes this.

What an ERP actually contains for CSRD purposes

A modern ERP — SAP S/4HANA, Oracle Fusion Cloud, NetSuite — is a structured transactional record of nearly every emission-generating activity in your company's operations. This is not a metaphor. The data is there. The question is whether it is being read in a format that serves environmental disclosure.

Here is a concrete mapping of ERP data modules to GHG Protocol categories:

General Ledger (FI-GL in SAP terms). Every transaction is coded to a cost center and a GL account. GL account categorization reliably identifies energy costs (Scope 1 combustion and Scope 2 electricity), transport costs (Scope 3 Categories 4 and 9), capital expenditure (Scope 3 Category 2), and business services spend (part of Scope 3 Category 1). The cost center dimension gives you the site-level granularity needed for location-based Scope 2 calculations.

Purchase Orders and Vendor Master (MM module). PO data contains vendor identity, commodity code, quantity, and unit price. In SAP environments, the vendor master record often contains the vendor's industry classification — a NACE code or equivalent — which is the key input to spend-based Category 1 emission factor application. Procurement data is also the source for identifying which suppliers you need to engage for primary emission factor data in future reporting cycles.

Accounts Payable Invoices (FI-AP). Utility invoices processed through AP contain line-item data: electricity in kWh, natural gas in m³, fuel oil in liters. This is activity-data-level precision — not a spend-based approximation. A utility invoice in SAP that records 42,300 kWh for a specific cost center in a specific billing month is exactly the input needed for a Scope 2 location-based calculation. No proxy method required.
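To make the arithmetic concrete, here is a minimal Python sketch of the location-based calculation for the invoice described above. The grid factor of 0.380 kg CO2e per kWh is a placeholder chosen for illustration, not a published value, and the class name, function name, and cost center ID are hypothetical.

```python
from dataclasses import dataclass

# Placeholder grid-average factor in kg CO2e per kWh. This is an assumed
# value for illustration only; use the published factor for the site's grid.
ASSUMED_GRID_FACTOR_KG_PER_KWH = 0.380


@dataclass(frozen=True)
class UtilityInvoiceLine:
    cost_center: str     # hypothetical identifier
    billing_month: str   # e.g. "2024-03"
    kwh: float           # metered consumption from the invoice line


def location_based_scope2_tco2e(line, factor_kg_per_kwh):
    """Return tonnes CO2e: kWh x (kg CO2e per kWh) / 1000."""
    return line.kwh * factor_kg_per_kwh / 1000.0


line = UtilityInvoiceLine(cost_center="CC-1000", billing_month="2024-03", kwh=42_300)
result = location_based_scope2_tco2e(line, ASSUMED_GRID_FACTOR_KG_PER_KWH)
print(f"{result:.3f} tCO2e")  # prints "16.074 tCO2e" with the placeholder factor
```

The point of the sketch is that the only external input is the emission factor; the activity data comes straight from the invoice line.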

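Fixed Asset Register (FI-AA). CapEx by asset category is the data for Scope 3 Category 2 (capital goods). Accumulated depreciation by asset class can also support a supplementary view that spreads capital goods emissions across asset lifetimes, which smooths the spike in the purchase year. Note that the GHG Protocol Scope 3 Standard accounts for capital goods emissions in full in the year of acquisition, so the depreciation-based view works as an internal trend metric rather than as the disclosed figure.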
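The module-to-category mapping above can be sketched as a simple lookup table. This is an illustrative outline only: the account-category labels, field names, and cost center IDs are hypothetical, and a real chart of accounts needs a company-specific mapping.

```python
from collections import defaultdict

# Hypothetical account-category labels mapped to the candidate GHG Protocol
# categories named in the text. A category can map to more than one candidate
# (for example, energy spend may be Scope 1 fuel or Scope 2 electricity).
GL_CATEGORY_TO_GHG = {
    "energy": ("Scope 1", "Scope 2"),
    "transport": ("Scope 3 Category 4", "Scope 3 Category 9"),
    "capital_expenditure": ("Scope 3 Category 2",),
    "business_services": ("Scope 3 Category 1",),
}


def summarize_spend(gl_lines):
    """Sum spend per (cost center, candidate GHG categories).

    Lines whose account category has no mapping are returned separately
    so they can be reviewed instead of being silently dropped.
    """
    totals = defaultdict(float)
    unmapped = []
    for line in gl_lines:
        candidates = GL_CATEGORY_TO_GHG.get(line["account_category"])
        if candidates is None:
            unmapped.append(line)
            continue
        totals[(line["cost_center"], candidates)] += line["amount_eur"]
    return dict(totals), unmapped


sample_lines = [
    {"cost_center": "PLANT-01", "account_category": "energy", "amount_eur": 1200.0},
    {"cost_center": "PLANT-01", "account_category": "energy", "amount_eur": 800.0},
    {"cost_center": "PLANT-02", "account_category": "transport", "amount_eur": 450.0},
    {"cost_center": "PLANT-02", "account_category": "travel", "amount_eur": 90.0},
]
totals, unmapped = summarize_spend(sample_lines)
```

Keeping the cost center in the grouping key preserves the site-level granularity that location-based Scope 2 calculations depend on.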

What the ERP does not contain — and why this matters for methodology

Being precise about ERP data boundaries is the difference between an ESRS-defensible methodology and a gap-ridden disclosure. The following Scope 3 categories require data sources outside the ERP, and documenting this explicitly is not a weakness — it is a required part of the ESRS E1 methodology disclosure.

Category 7 (employee commuting). Commuting patterns are not captured in financial systems. The standard approach is a workforce commuting survey administered once per reporting year, collecting transport mode, distance, and frequency per employee or per site. Payroll location data can support a proxy calculation in the absence of a survey, but the assumptions required (average commute distance by postcode area, transport mode split by urban/rural) must be documented and are typically lower confidence than survey data.

Category 9 (downstream transportation and distribution). Outbound freight emissions require shipment-level data: weight, distance, and transport mode. This data lives in a transportation management system (TMS) or is available from 3PL carrier invoice data. ERP sales order data captures destination and order quantity but rarely captures carrier identity or transport mode in a form suitable for emission calculation without additional mapping.

Category 11 (use of sold products). For manufacturers of energy-consuming products — industrial pumps, HVAC equipment, compressors, electric motors — use-phase emissions are typically the largest Scope 3 category by a significant margin, often 70-90% of total value chain emissions. This data requires engineering product specifications: rated power consumption, expected annual operating hours, expected product lifetime. None of this is in the ERP. It requires a structured data collection exercise from product engineering teams, product documentation, or published test data.

The Greenopsiq gap analysis dashboard maps each GHG Protocol category to its data source, flags categories where ERP data provides activity-level precision, and explicitly marks categories requiring supplementary data with the expected data collection approach. Your assurance provider sees the full picture — not just the categories where data quality is high.

The OAuth connector architecture

The connection between a compliance platform and an enterprise ERP is a security-sensitive integration. ERP systems contain your full procurement history, vendor relationships, and financial transaction data. The architecture needs to be read-only, scope-limited, and auditable from the ERP's access control system.

For SAP S/4HANA, the Greenopsiq connector uses OAuth 2.0 authorization code flow with explicit read-only scope grants. The specific SAP APIs we consume are:

sap.fi.general-ledger.read — GL line items, cost center assignments, document references
sap.mm.purchasing.read — Purchase orders, vendor master, goods receipts
sap.fi.accounts-payable.read — Invoice documents, payment details
sap.fi.fixed-assets.read — Asset master, acquisition and retirement transactions

No write scopes. No access to HR, payroll, CRM, or production planning modules. The OAuth client is registered in SAP Identity Authentication Service (IAS) by your SAP Basis administrator and can be revoked at any time from the SAP side without requiring Greenopsiq involvement.
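One way to enforce the read-only boundary on the client side is to check the scopes actually granted before any data pull. The sketch below is an assumption about how such a check could look, not a description of the Greenopsiq connector; the function name is hypothetical, and the scope strings are the four listed above. It relies only on the OAuth 2.0 convention that the `scope` value is a space-delimited string.

```python
# The four read scopes listed in the article. Anything outside this set
# is treated as an unexpected grant.
READ_ONLY_SCOPES = frozenset({
    "sap.fi.general-ledger.read",
    "sap.mm.purchasing.read",
    "sap.fi.accounts-payable.read",
    "sap.fi.fixed-assets.read",
})


def check_granted_scopes(scope_param):
    """Validate a space-delimited OAuth scope string.

    Raises PermissionError if any scope outside the allowed read-only set
    was granted. Returns a sorted list of expected scopes that are missing.
    """
    granted = set(scope_param.split())
    unexpected = sorted(granted - READ_ONLY_SCOPES)
    if unexpected:
        raise PermissionError(f"unexpected scopes granted: {unexpected}")
    return sorted(READ_ONLY_SCOPES - granted)


missing = check_granted_scopes(
    "sap.fi.general-ledger.read sap.mm.purchasing.read"
)
print(missing)
```

Failing closed on an unexpected scope means a misconfigured client registration is caught before any request reaches the ERP.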

The setup sequence on connection day: your Basis admin registers the OAuth client, configures the scope list, and shares the client credentials through our encrypted credential delivery portal. This takes 45-90 minutes for an experienced SAP admin. No SAP customization or ABAP development is required.

For Oracle Fusion Cloud, the equivalent uses Oracle REST APIs with OAuth scopes issued by Oracle Identity Cloud Service (IDCS). NetSuite uses SuiteQL through the SuiteTalk REST web services query endpoint, with a custom role scoped to the required record types. Microsoft Dynamics 365 uses the Dataverse Web API with read-only application user permissions.

A concrete example: a mid-size specialty chemicals manufacturer running SAP S/4HANA 2022 with approximately 180,000 GL line items per year. OAuth setup took one afternoon. The initial pull of FY2024 GL, PO, AP, and FA data completed in 3.5 hours. The classification engine processed all transactions against the vendor taxonomy and the GHG category mapping. Items with classification confidence below 0.75 (approximately 8% of spend by value) were flagged for one-click human review in the dashboard. The Scope 1, Scope 2, and Scope 3 Category 1 estimates were available for review the following morning.
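The review triage described in this example can be sketched in a few lines. The 0.75 threshold comes from the text; the record fields, function name, and sample amounts are hypothetical and chosen so that the flagged share works out to 8% of spend by value.

```python
REVIEW_THRESHOLD = 0.75  # confidence cut-off mentioned in the text


def triage(transactions, threshold=REVIEW_THRESHOLD):
    """Split transactions into auto-accepted and flagged-for-review.

    Returns (accepted, flagged, flagged_share), where flagged_share is the
    fraction of total spend by value that needs human review.
    """
    accepted = [t for t in transactions if t["confidence"] >= threshold]
    flagged = [t for t in transactions if t["confidence"] < threshold]
    total = sum(t["amount_eur"] for t in transactions)
    flagged_value = sum(t["amount_eur"] for t in flagged)
    flagged_share = flagged_value / total if total else 0.0
    return accepted, flagged, flagged_share


# Invented sample data for illustration.
sample = [
    {"doc": "PO-1", "amount_eur": 6000.0, "confidence": 0.97},
    {"doc": "PO-2", "amount_eur": 3200.0, "confidence": 0.81},
    {"doc": "PO-3", "amount_eur": 800.0, "confidence": 0.60},
]
accepted, flagged, flagged_share = triage(sample)
print(f"{len(flagged)} item(s) flagged, {flagged_share:.0%} of spend by value")
```

Measuring the flagged share by value rather than by line count matches how the example reports it, and keeps reviewer attention on the transactions that move the inventory most.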

Data quality and the audit trail problem

The audit trail is where manual-process ESG reporting breaks down most visibly for assurance providers. Under ISAE 3000 (Revised), the general assurance standard commonly applied to limited assurance on sustainability disclosures, the assurance provider needs to be able to trace each disclosed figure back to its underlying source data. A spreadsheet in which data was exported from SAP, reformatted by a consultant, and multiplied by emission factors in a pivot table does not provide this traceability.

An ERP connector produces traceability by construction. Every emission record in the output carries:

Source document reference: GL posting number, AP invoice number, or PO reference — the assurance provider can go look up the original SAP document
Emission factor version pin: e.g., "IPCC AR6 Table A.III.2, fugitive emissions factor for natural gas distribution networks, version 2021" — reproducible, versionable, not dependent on a consultant's factor file
Classification confidence score: the model's confidence in the GHG category assignment for each transaction, plus override history if a human reclassified a line item
Calculation formula trace: activity data (kWh or kg or €) × emission factor (tCO2e per unit) = tCO2e result — explicit, not implicit
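The four fields above could be carried on each output record roughly as follows. This is a sketch of one possible record shape, not the platform's actual schema; every field name, identifier, and value is hypothetical, and the factor is a placeholder.

```python
from dataclasses import dataclass, field


@dataclass
class EmissionRecord:
    source_document: str          # e.g. GL posting, AP invoice, or PO reference
    factor_id: str                # which emission factor was applied
    factor_version: str           # pinned version, so the result is reproducible
    activity_value: float         # activity data from the source document
    activity_unit: str            # e.g. "kWh", "kg", "EUR"
    factor_tco2e_per_unit: float  # emission factor in tCO2e per activity unit
    confidence: float             # classification confidence, 0.0 to 1.0
    overrides: list = field(default_factory=list)  # human reclassification history

    @property
    def tco2e(self):
        """Activity data x emission factor."""
        return self.activity_value * self.factor_tco2e_per_unit

    def formula_trace(self):
        """Render the calculation explicitly for the audit trail."""
        return (
            f"{self.activity_value:g} {self.activity_unit} x "
            f"{self.factor_tco2e_per_unit:g} tCO2e/{self.activity_unit} "
            f"= {self.tco2e:.3f} tCO2e"
        )


record = EmissionRecord(
    source_document="AP-INV-0001",        # hypothetical document number
    factor_id="placeholder-grid-factor",  # hypothetical factor identifier
    factor_version="v1",
    activity_value=42_300,
    activity_unit="kWh",
    factor_tco2e_per_unit=0.00038,        # placeholder value
    confidence=0.93,
)
print(record.formula_trace())
```

Because the factor version and the source document reference travel with the result, a reviewer can reproduce any single figure without access to the analyst's working files.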

This is the difference between a disclosure that passes limited assurance review in one round and one that requires three rounds of evidence requests. We are not suggesting that manual processes cannot eventually produce acceptable audit trails — they can. But the marginal cost of maintaining that trail in a spreadsheet, updated across multiple analyst hands over multiple years, accumulates in ways that ERP-sourced data does not.

Why the connector architecture matters beyond year one

Year-one CSRD disclosure is a one-time project. Year-two through year-ten disclosure is an ongoing operational process. The manual extraction model that works (barely) for the first disclosure year does not scale. Each subsequent year requires re-running the same extraction and mapping workflow, maintaining factor version updates, and extending the process as the company's ERP changes and as ESRS data requirements evolve.

A live ERP connector, by contrast, provides a continuously refreshed data feed. Nightly data pulls mean that mid-year estimates are available for board reporting and internal tracking without a new project. When the company acquires a new subsidiary in Q3, the new entity's ERP data is included in the next pull. When IPCC updates emission factors, the factor library updates and the historical comparison is recalculated without manual rework.
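The recalculation step can be illustrated with a versioned factor library. The sketch assumes factors are keyed by identifier and version; the library contents, version labels, and activity figures are all invented for illustration.

```python
# Hypothetical factor library keyed by (factor id, version).
# Values are placeholder factors in tCO2e per kWh.
FACTOR_LIBRARY = {
    ("grid_electricity", "2024.1"): 0.00040,
    ("grid_electricity", "2025.1"): 0.00038,
}


def recalculate_history(activity_kwh_by_year, factor_id, version):
    """Recompute emissions for every year using one pinned factor version."""
    factor = FACTOR_LIBRARY[(factor_id, version)]
    return {year: kwh * factor for year, kwh in activity_kwh_by_year.items()}


# Invented activity data retained from earlier ERP pulls.
history_kwh = {2023: 500_000.0, 2024: 480_000.0}

before = recalculate_history(history_kwh, "grid_electricity", "2024.1")
after = recalculate_history(history_kwh, "grid_electricity", "2025.1")

for year in sorted(history_kwh):
    print(year, f"{before[year]:.1f} -> {after[year]:.1f} tCO2e")
```

Since the activity data is stored separately from the factors, a factor update becomes a lookup change rather than a rebuild of prior-year workbooks.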

If your team is still debating whether to automate the ERP connection or continue with annual CSV exports, the full technical architecture page walks through the pipeline in detail. The decision point is not really about year one — it is about whether you want to rebuild the same manual process every twelve months for the foreseeable future.
