What is a Financial Data Pack?
A financial data pack (or data book) is a standardized Excel workbook that organizes a company’s financial and operational data into a consistent, analysis-ready format. It is a foundational deliverable in private equity, investment banking, and asset management — used for M&A due diligence, investment committee presentations, and portfolio company monitoring. The concept is straightforward: financial data comes from many sources (CIMs, SEC filings, management presentations, industry reports) in inconsistent formats. A data pack normalizes this information into a standard 8-tab structure that any financial professional can immediately navigate: Executive Summary, Historical Financials, Balance Sheet, Cash Flow, Operating Metrics, Segment Performance, Market Analysis, and Investment Highlights. Data packs are critical in private equity, where firms evaluate hundreds of potential acquisitions per year. A standardized format allows investment committee members to quickly compare opportunities and make informed decisions. At firms like KKR, Blackstone, Apollo, and Carlyle, data packs are the starting point for every investment discussion.Why It Matters
- Standardization enables comparison: When every deal is presented in the same format, investment committee members can quickly compare opportunities
- Data accuracy is non-negotiable: Financial models built on incorrect data produce incorrect valuations. The data pack is where accuracy is first established
- Due diligence efficiency: A well-organized data pack accelerates diligence by answering common questions upfront
- Institutional memory: Data packs from deals that did not close remain valuable references for future opportunities
Key Concepts
| Term | Definition |
|---|---|
| EBITDA | Earnings Before Interest, Taxes, Depreciation, and Amortization — the standard profitability metric |
| Adjusted EBITDA | EBITDA with normalization adjustments for non-recurring items, owner compensation, and pro forma items |
| Add-back | An adjustment that increases EBITDA (e.g., removing a one-time restructuring charge) |
| Quality of Earnings (QoE) | Independent accountant’s report validating adjusted EBITDA |
| Blue/Black/Green Font Convention | Blue = hardcoded inputs, Black = formulas, Green = links to other sheets |
| Bridge Analysis | A walk from reported EBITDA to adjusted EBITDA, showing each add-back |
| Investment Committee (IC) | The senior group at a PE firm that approves investment decisions |
Worked Example: Data Pack for a Healthcare Services Company
Walk through building the data pack for a PE firm evaluating an acquisition of a home healthcare company with $120M revenue.Tab 1: Executive Summary
Tab 2: Historical Financials (Income Statement)
- Blue text: All revenue figures, COGS, opex (hardcoded from source)
- Black text: Gross Profit, EBITDA, margins (formulas)
- Green text: Any cross-references to other tabs
Tab 5: Operating Metrics (Critical Formatting)
Scenario Building (Tab within Financial Model)
Full Skill Workflow (From SKILL.md)
Phase 1: Document Processing and Data Extraction
- Analyze source materials (CIM, SEC filings, management presentations, QoE report)
- Locate 3-5 years of historical financial statements
- Identify management projections if included
- Note fiscal year end date
- Extract: income statement, balance sheet, cash flow, operating metrics, market data
- Note all page references for traceability
Phase 2: Data Normalization and Standardization
- Ensure consistent line item names across all years
- Identify one-time charges and document each
- Create Adjusted EBITDA reconciliation
- Apply format detection logic:
- Revenue, EBITDA, costs = Currency format ($#,##0.0)
- Locations, patients, employees = Number format (#,##0, NO $)
- Margins, growth rates = Percentage format (0.0%)
- Years = Text format (prevent “2,024” display)
- Verify data integrity: subtotals sum correctly, balance sheet balances
Phase 3: Build Excel Workbook (8 Tabs)
| Tab | Purpose | Key Formatting |
|---|---|---|
| 1. Executive Summary | One-page overview | Bold headers, key metrics prominent |
| 2. Historical Financials | Complete P&L | Currency format, single/double underlines |
| 3. Balance Sheet | Full BS | Must balance: A = L + E |
| 4. Cash Flow Statement | Cash flows | Link to IS and BS |
| 5. Operating Metrics | Non-financial KPIs | NO dollar signs on counts |
| 6. Segment Performance | Revenue by segment | Consistent with Tab 2 totals |
| 7. Market Analysis | Industry context | Source citations |
| 8. Investment Highlights | Narrative | Bullets, concise paragraphs |
- Blue font (0,0,255): All hardcoded inputs
- Black font (0,0,0): All formulas and calculations
- Green font (0,128,0): Links to other sheets
- Negatives in parentheses: 123
- Years as text: 2024 not 2,024
- Headers bold, left-aligned
- Numbers right-aligned
- Single underline above subtotals
- Double underline below final totals
- Freeze panes on headers
Phase 4: Scenario Building (if projections included)
Management Case: Present company’s projections as provided. Base Case (Risk-Adjusted): Apply conservative haircuts:- Revenue growth haircut reflecting execution risk
- Moderate margin expansion assumptions
- Increase capex if growth-dependent
- Document all adjustments with rationale
- Model revenue decline
- Assume margin compression
- Test covenant compliance and liquidity
Phase 5: Quality Control and Validation
Data accuracy checks:- Every number traces to source document
- All calculations are formula-based (no hardcoded values)
- Subtotals and totals are correct
- Years display without commas (2024 not 2,024)
- No #REF!, #VALUE!, or #DIV/0! errors
- Financial data has $ signs
- Operational data has NO $ signs
- Percentages display as % (15.0% not 0.15)
- Negatives in parentheses
- Headers bold and left-aligned
- Numbers right-aligned
- All 8 tabs present
- Balance sheet balances (A = L + E)
- Cash flow ties to BS changes
- Tab 6 segment totals match Tab 2 revenue
Phase 6: Final Delivery
Save as:CompanyName_DataPack_YYYY-MM-DD.xlsx
Industry-Specific Adaptations
Technology / SaaS
Technology / SaaS
Key metrics for Tab 5:
- ARR (Annual Recurring Revenue) and MRR — currency format ($)
- Customer count by cohort — number format (no $)
- CAC (Customer Acquisition Cost) — currency format ($)
- LTV (Lifetime Value) — currency format ($)
- Churn rate (gross and net) — percentage format (%)
- Net revenue retention — percentage format (%)
- Rule of 40 (Growth % + EBITDA Margin %) — percentage (%)
- Magic number (sales efficiency) — decimal (x.xx)
Manufacturing / Industrial
Manufacturing / Industrial
Key metrics for Tab 5:
- Production capacity — units (no $)
- Capacity utilization — percentage (%)
- Units produced by product line — units (no $)
- Inventory turns — ratio (x.x)
- Gross margin by product line — percentage (%)
- Order backlog — currency ($)
- Headcount — number (no $)
Real Estate / Hospitality
Real Estate / Hospitality
Key metrics for Tab 5:
- Properties/rooms/square footage — number (no $)
- Occupancy rates — percentage (%)
- ADR (Average Daily Rate) — currency ($)
- RevPAR (Revenue per Available Room) — currency ($)
- NOI (Net Operating Income) — currency ($)
- Cap rates — percentage (%)
- FF&E reserve — currency ($)
Healthcare / Services
Healthcare / Services
Key metrics for Tab 5:
- Locations/facilities — number (no $)
- Providers/employees — number (no $)
- Patients/visits — number (no $)
- Revenue per visit — currency ($)
- Payor mix — percentage (%)
- Same-store growth — percentage (%)
- Readmission rate — percentage (%)
Common Mistakes (and How to Avoid Them)
Mistake 1: Dollar Signs on Operational Metrics
Mistake 1: Dollar Signs on Operational Metrics
What goes wrong: The Operating Metrics tab shows “Stores: 2,500.” Dollar signs on non-financial data is immediately noticed by any PE professional and signals amateur work.How to avoid it: Apply the format rule: Financial data (measuring money) gets 4,300K.
Mistake 2: Years Displaying as '2,024'
Mistake 2: Years Displaying as '2,024'
What goes wrong: Excel interprets year labels as numbers and adds comma formatting. Column headers show “2,024” instead of “2024”. This looks terrible and confuses readers.How to avoid it: Format year cells as text or use a custom format that prevents commas. Enter years with an apostrophe prefix (‘2024) to force text format. Check every tab header.
Mistake 3: Hardcoded Calculations
Mistake 3: Hardcoded Calculations
What goes wrong: Gross margin is typed as “45.0%” instead of calculated as =Gross_Profit/Revenue. When a revenue figure is corrected later, the margin does not update. The data pack contains internally inconsistent numbers.How to avoid it: Every subtotal, total, ratio, and derived metric must be formula-based. Track row numbers when building the model so formulas reference the correct cells. Test by changing an input and verifying that all downstream calculations update.
Mistake 4: Balance Sheet Does Not Balance
Mistake 4: Balance Sheet Does Not Balance
What goes wrong: Total Assets = 82M. A $3M gap. The fundamental accounting identity is violated. This error destroys credibility with any financial professional.How to avoid it: Build a balance check formula (=Assets - Liabilities - Equity) on every tab containing balance sheet data. This cell must equal zero for every period. If it does not, find and fix the error before delivery.
Mistake 5: Recurring 'Non-Recurring' Items
Mistake 5: Recurring 'Non-Recurring' Items
What goes wrong: The EBITDA adjustments include “restructuring charges” in FY2022, FY2023, and FY2024. If the company restructures every year, these charges are not truly non-recurring. Adding them back inflates adjusted EBITDA and misleads the investment committee.How to avoid it: For each add-back, ask: “Has this occurred in multiple years?” If yes, it is likely a recurring cost of doing business and should not be added back. Document the recurrence pattern and flag it for the IC.
Mistake 6: Missing Source Citations
Mistake 6: Missing Source Citations
What goes wrong: The data pack contains hundreds of data points with no indication of where they came from. An IC member asks “where did you get the $120M revenue figure?” and the analyst cannot trace it to a specific document.How to avoid it: Add cell comments or a separate “Sources” column referencing specific documents and page numbers. For web-sourced data, include URLs. For MCP-sourced data, include the query and timestamp.
Mistake 7: Inconsistent Line Item Names
Mistake 7: Inconsistent Line Item Names
What goes wrong: Tab 2 shows “Cost of Goods Sold” in FY2022 but “Cost of Revenue” in FY2023 and “COGS” in FY2024. Same concept, three different names. Formulas may not reference correctly, and the tab is hard to read.How to avoid it: Standardize all line item names across all years during the normalization phase. Pick one term and use it consistently. Document any name changes from the source materials.
Mistake 8: Cluttered Formatting
Mistake 8: Cluttered Formatting
What goes wrong: The workbook has heavy borders on every cell, multiple font sizes, colorful fill colors, and merged cells everywhere. It looks like a corporate dashboard, not a PE data pack.How to avoid it: PE data packs use minimal formatting: no cell borders (or very light ones), consistent font (Calibri or Arial 11pt), white backgrounds, and the blue/black/green font convention. Less is more.
Mistake 9: No Scenario Analysis
Mistake 9: No Scenario Analysis
What goes wrong: The data pack includes management’s projections as the only forward view. The IC asks “what happens if growth slows to 5%?” and there is no analysis prepared.How to avoid it: Always include at least a Management Case and a Base Case (risk-adjusted). For LBO analysis, add a Downside Case with stress-tested revenue and margins. Document the assumptions for each scenario.
Mistake 10: Delivering an Incomplete Data Pack
Mistake 10: Delivering an Incomplete Data Pack
What goes wrong: The data pack has 6 of 8 tabs. Operating Metrics and Market Analysis are missing. The IC cannot compare this deal to prior investments because the format is inconsistent.How to avoid it: All 8 tabs must be present, even if some have limited data. If market analysis data is unavailable, note “Market data to be sourced during diligence” on the tab rather than omitting it entirely.
Daily Workflow Scenarios
Scenario 1: Building a Data Pack from a CIM
Context: A PE firm received a CIM for a potential acquisition. The associate needs to extract data into a standard data pack. Day 1 (4-5 hours): Phase 1-2. Read the CIM. Extract all financial data, operating metrics, and market data. Normalize line items. Day 2 (4-5 hours): Phase 3. Build the 8-tab Excel workbook. Enter data with proper formatting. Build formulas for all calculated metrics. Day 3 (2-3 hours): Phase 4-5. Build scenarios. Quality check. Verify balances, formulas, and formatting. Deliver.Scenario 2: Building a Data Pack from SEC Filings
Context: No CIM available. Build the data pack from public SEC filings (10-K, 10-Q). Day 1 (3-4 hours): Extract 3-5 years of financial statements from 10-K filings (EDGAR). Standardize line items across years. Day 2 (3-4 hours): Build the workbook. Add operating metrics from MD&A sections. Research market data from industry reports. Day 3 (2-3 hours): Quality check and deliver.Scenario 3: Updating a Data Pack After New Quarter
Context: The company reported Q3 results. Update the existing data pack with the latest quarter. Timeline (2-3 hours):- Extract Q3 financials from 10-Q or earnings release
- Add new column to each financial tab
- Update operating metrics
- Update LTM (Last Twelve Months) calculations
- Verify all formulas still work with the new column
- Update Executive Summary with latest figures
Practice Exercise
Exercise: Build a Data Pack for a Restaurant Chain A PE firm is evaluating the acquisition of a fast-casual restaurant chain with 45 locations. Financial data (provided):| Metric | FY2022 | FY2023 | FY2024 |
|---|---|---|---|
| Revenue | $52M | $61M | $72M |
| COGS (food + labor) | $33.8M | $39.0M | $45.4M |
| Rent & Occupancy | $5.2M | $6.1M | $7.2M |
| Other Opex | $7.8M | $8.5M | $9.0M |
| D&A | $2.6M | $3.0M | $3.6M |
| Locations | 35 | 40 | 45 |
| Average Unit Volume | $1.49M | $1.53M | $1.60M |
| Same-Store Sales Growth | 8% | 5% | 7% |
| Ticket Size | $14.50 | $15.20 | $15.80 |
| Daily Transactions/Store | 280 | 275 | 278 |
How to Add to Your Local Context
Best Practices
- Zero tolerance for data errors: Trace every number to its source with a page reference
- Formula-based calculations only: Never hardcode a calculated value
- Verify the balance sheet balances: A = L + E for every period
- Normalization transparency: Every add-back must be documented with dollar impact and rationale
- Be skeptical of recurring “non-recurring” items: Annual restructuring charges are recurring
- Test formulas before delivery: Spot-check calculated values manually
- Follow the blue/black/green convention: It is an industry standard that enables instant recognition
Dependencies
Required:- XLSX skill for Excel workbook creation
- Source financial data (CIM, SEC filings, or management materials)
- SEC EDGAR for public company filings
- Industry databases for market analysis data
- QoE report for validated EBITDA adjustments