Skip to main content

What is a Financial Data Pack?

A financial data pack (or data book) is a standardized Excel workbook that organizes a company’s financial and operational data into a consistent, analysis-ready format. It is a foundational deliverable in private equity, investment banking, and asset management — used for M&A due diligence, investment committee presentations, and portfolio company monitoring. The concept is straightforward: financial data comes from many sources (CIMs, SEC filings, management presentations, industry reports) in inconsistent formats. A data pack normalizes this information into a standard 8-tab structure that any financial professional can immediately navigate: Executive Summary, Historical Financials, Balance Sheet, Cash Flow, Operating Metrics, Segment Performance, Market Analysis, and Investment Highlights. Data packs are critical in private equity, where firms evaluate hundreds of potential acquisitions per year. A standardized format allows investment committee members to quickly compare opportunities and make informed decisions. At firms like KKR, Blackstone, Apollo, and Carlyle, data packs are the starting point for every investment discussion.

Why It Matters

  • Standardization enables comparison: When every deal is presented in the same format, investment committee members can quickly compare opportunities
  • Data accuracy is non-negotiable: Financial models built on incorrect data produce incorrect valuations. The data pack is where accuracy is first established
  • Due diligence efficiency: A well-organized data pack accelerates diligence by answering common questions upfront
  • Institutional memory: Data packs from deals that did not close remain valuable references for future opportunities

Key Concepts

TermDefinition
EBITDAEarnings Before Interest, Taxes, Depreciation, and Amortization — the standard profitability metric
Adjusted EBITDAEBITDA with normalization adjustments for non-recurring items, owner compensation, and pro forma items
Add-backAn adjustment that increases EBITDA (e.g., removing a one-time restructuring charge)
Quality of Earnings (QoE)Independent accountant’s report validating adjusted EBITDA
Blue/Black/Green Font ConventionBlue = hardcoded inputs, Black = formulas, Green = links to other sheets
Bridge AnalysisA walk from reported EBITDA to adjusted EBITDA, showing each add-back
Investment Committee (IC)The senior group at a PE firm that approves investment decisions

Worked Example: Data Pack for a Healthcare Services Company

Walk through building the data pack for a PE firm evaluating an acquisition of a home healthcare company with $120M revenue.

Tab 1: Executive Summary

COMPANY OVERVIEW
HomeWell Health Services ("HomeWell") is a provider of in-home
nursing and rehabilitation services operating 28 locations across
the Southeast United States. Founded in 2011, the company serves
~8,500 active patients through a network of 1,200+ healthcare
professionals (nurses, physical therapists, occupational therapists).

KEY INVESTMENT HIGHLIGHTS
1. Aging demographics create durable, secular demand tailwind
2. #3 position in Southeast US with 6% regional market share
3. Revenue CAGR of 18% (FY2022-FY2024) driven by same-store
   growth (12%) and new location openings (6%)
4. Adjusted EBITDA margin expanded from 12% to 16% over 3 years
5. 92% Medicare/Medicaid payor mix provides predictable revenue

FINANCIAL SNAPSHOT
| Metric              | FY2022 | FY2023 | FY2024 | FY2025E |
|---------------------|--------|--------|--------|---------|
| Revenue ($M)        | $86.0  | $102.0 | $120.0 | $138.0  |
| Revenue Growth      | 15.0%  | 18.6%  | 17.6%  | 15.0%   |
| Adj. EBITDA ($M)    | $10.3  | $14.3  | $19.2  | $23.5   |
| Adj. EBITDA Margin  | 12.0%  | 14.0%  | 16.0%  | 17.0%   |
| Locations           | 20     | 24     | 28     | 32      |
| Active Patients     | 5,800  | 7,200  | 8,500  | 9,800   |

TRANSACTION OVERVIEW
Expected EV: $170-200M (9-10x Adjusted EBITDA)
Seller: Founder-owned; founder willing to stay 2 years
Structure: 100% acquisition; debt-financed LBO

Tab 2: Historical Financials (Income Statement)

HOMEWELL HEALTH SERVICES -- INCOME STATEMENT ($M)

                              FY2022   FY2023   FY2024
Revenue
  Home Nursing Services       $58.0    $70.0    $82.0
  Physical Therapy             18.0     21.0     25.0
  Occupational Therapy          7.0      8.0     10.0
  Speech Therapy                3.0      3.0      3.0
Total Revenue                 $86.0   $102.0   $120.0

Cost of Revenue
  Clinical Staff Wages        ($43.0)  ($49.0)  ($55.2)
  Clinical Supplies            (4.3)    (5.1)    (6.0)
  Vehicle/Travel Costs         (3.4)    (4.1)    (4.8)
Total COGS                   ($50.7)  ($58.2)  ($66.0)

Gross Profit                  $35.3    $43.8    $54.0
  Gross Margin                 41.0%    42.9%    45.0%

Operating Expenses
  SG&A                       ($15.5)  ($17.3)  ($19.2)
  Rent & Occupancy             (3.4)    (4.1)    (4.8)
  Marketing                    (1.7)    (2.0)    (2.4)
  Technology                   (0.9)    (1.0)    (1.2)
Total Opex                   ($21.5)  ($24.4)  ($27.6)

EBITDA                         $9.5    $13.3    $18.0
  D&A                         (2.6)    (3.1)    (3.6)
Operating Income               $6.9    $10.2    $14.4

EBITDA Adjustments
  Owner excess compensation    +$0.3    +$0.3    +$0.3
  Restructuring charges         —       +$0.5     —
  New location startup costs   +$0.5    +$0.2    +$0.9
  Litigation settlement         —        —        —

Adjusted EBITDA               $10.3    $14.3    $19.2
  Adj. EBITDA Margin           12.0%    14.0%    16.0%
Font color coding applied:
  • Blue text: All revenue figures, COGS, opex (hardcoded from source)
  • Black text: Gross Profit, EBITDA, margins (formulas)
  • Green text: Any cross-references to other tabs

Tab 5: Operating Metrics (Critical Formatting)

OPERATING METRICS -- NO DOLLAR SIGNS ON OPERATIONAL DATA

                              FY2022   FY2023   FY2024
SCALE METRICS
  Locations                      20       24       28
  Active Patients             5,800    7,200    8,500
  Healthcare Professionals    850      1,020    1,200
  Patient-to-Staff Ratio      6.8x     7.1x     7.1x

PRODUCTIVITY METRICS
  Revenue per Location ($K)   $4,300   $4,250   $4,286
  Revenue per Patient          $14,828  $14,167  $14,118
  Revenue per Professional    $101,176 $100,000 $100,000
  Visits per Patient/Year       48       50       52

GROWTH METRICS
  Same-Store Revenue Growth    12.0%    11.5%    12.0%
  New Location Revenue Growth   3.0%     7.1%     5.6%
  Patient Growth               15.0%    24.1%    18.1%
  Professional Growth          12.0%    20.0%    17.6%

RETENTION & QUALITY
  Patient Retention Rate       88.0%    89.5%    91.0%
  Professional Retention Rate  78.0%    80.0%    82.0%
  Medicare Star Rating           4.2      4.3      4.5
  Readmission Rate (30-day)    14.0%    13.5%    12.8%
Critical formatting note: Locations (20, 24, 28), Patients (5,800), and Professionals (850) are operational counts — they use number format with commas, NO dollar signs. Revenue per Location uses dollar format because it measures money.

Scenario Building (Tab within Financial Model)

SCENARIO ANALYSIS

                         Management  Base Case   Downside
                         Case        (Risk-Adj)  Case
Revenue Growth (FY25E)   15.0%       12.0%       5.0%
  Rationale             Mgmt plan   Haircut for  Recession;
                                    execution    reimbursement
                                    risk         rate cuts

EBITDA Margin (FY25E)    17.0%       16.0%       13.0%
  Rationale             Continued   Margin       Wage inflation
                        expansion   stable       compresses
                                                 margins

New Locations (FY25E)    4           3            1
  Rationale             Full plan   Moderate     Pause expansion
                                    pace

Revenue ($M)             $138.0     $134.4       $126.0
EBITDA ($M)               $23.5      $21.5        $16.4
EBITDA Multiple (10x)    $235M       $215M        $164M

Key Risk Tested:
  - Downside: Medicare reimbursement rate cut of 3-5%
  - Downside: Nurse wage inflation of 8% (vs. 4% base)
  - Downside: Same-store growth decelerates to 3%

Full Skill Workflow (From SKILL.md)

Phase 1: Document Processing and Data Extraction

  1. Analyze source materials (CIM, SEC filings, management presentations, QoE report)
  2. Locate 3-5 years of historical financial statements
  3. Identify management projections if included
  4. Note fiscal year end date
  5. Extract: income statement, balance sheet, cash flow, operating metrics, market data
  6. Note all page references for traceability

Phase 2: Data Normalization and Standardization

  1. Ensure consistent line item names across all years
  2. Identify one-time charges and document each
  3. Create Adjusted EBITDA reconciliation
  4. Apply format detection logic:
    • Revenue, EBITDA, costs = Currency format ($#,##0.0)
    • Locations, patients, employees = Number format (#,##0, NO $)
    • Margins, growth rates = Percentage format (0.0%)
    • Years = Text format (prevent “2,024” display)
  5. Verify data integrity: subtotals sum correctly, balance sheet balances

Phase 3: Build Excel Workbook (8 Tabs)

TabPurposeKey Formatting
1. Executive SummaryOne-page overviewBold headers, key metrics prominent
2. Historical FinancialsComplete P&LCurrency format, single/double underlines
3. Balance SheetFull BSMust balance: A = L + E
4. Cash Flow StatementCash flowsLink to IS and BS
5. Operating MetricsNon-financial KPIsNO dollar signs on counts
6. Segment PerformanceRevenue by segmentConsistent with Tab 2 totals
7. Market AnalysisIndustry contextSource citations
8. Investment HighlightsNarrativeBullets, concise paragraphs
Mandatory formatting rules:
  • Blue font (0,0,255): All hardcoded inputs
  • Black font (0,0,0): All formulas and calculations
  • Green font (0,128,0): Links to other sheets
  • Negatives in parentheses: (123.0)not(123.0) not -123
  • Years as text: 2024 not 2,024
  • Headers bold, left-aligned
  • Numbers right-aligned
  • Single underline above subtotals
  • Double underline below final totals
  • Freeze panes on headers

Phase 4: Scenario Building (if projections included)

Management Case: Present company’s projections as provided. Base Case (Risk-Adjusted): Apply conservative haircuts:
  • Revenue growth haircut reflecting execution risk
  • Moderate margin expansion assumptions
  • Increase capex if growth-dependent
  • Document all adjustments with rationale
Downside Case: Stress test:
  • Model revenue decline
  • Assume margin compression
  • Test covenant compliance and liquidity

Phase 5: Quality Control and Validation

Data accuracy checks:
  • Every number traces to source document
  • All calculations are formula-based (no hardcoded values)
  • Subtotals and totals are correct
  • Years display without commas (2024 not 2,024)
  • No #REF!, #VALUE!, or #DIV/0! errors
Format consistency checks:
  • Financial data has $ signs
  • Operational data has NO $ signs
  • Percentages display as % (15.0% not 0.15)
  • Negatives in parentheses
  • Headers bold and left-aligned
  • Numbers right-aligned
Structure checks:
  • All 8 tabs present
  • Balance sheet balances (A = L + E)
  • Cash flow ties to BS changes
  • Tab 6 segment totals match Tab 2 revenue

Phase 6: Final Delivery

Save as: CompanyName_DataPack_YYYY-MM-DD.xlsx

Industry-Specific Adaptations

Key metrics for Tab 5:
  • ARR (Annual Recurring Revenue) and MRR — currency format ($)
  • Customer count by cohort — number format (no $)
  • CAC (Customer Acquisition Cost) — currency format ($)
  • LTV (Lifetime Value) — currency format ($)
  • Churn rate (gross and net) — percentage format (%)
  • Net revenue retention — percentage format (%)
  • Rule of 40 (Growth % + EBITDA Margin %) — percentage (%)
  • Magic number (sales efficiency) — decimal (x.xx)
Common pitfall: ARR uses ,customercountdoesnot.Revenuepercustomeruses, customer count does not. Revenue per customer uses , customer count does not. Apply format detection based on whether the metric measures money vs. counting things.
Key metrics for Tab 5:
  • Production capacity — units (no $)
  • Capacity utilization — percentage (%)
  • Units produced by product line — units (no $)
  • Inventory turns — ratio (x.x)
  • Gross margin by product line — percentage (%)
  • Order backlog — currency ($)
  • Headcount — number (no $)
Common pitfall: Backlog is measured in dollars (currency format). Capacity is measured in units (number format). Both appear on the same tab.
Key metrics for Tab 5:
  • Properties/rooms/square footage — number (no $)
  • Occupancy rates — percentage (%)
  • ADR (Average Daily Rate) — currency ($)
  • RevPAR (Revenue per Available Room) — currency ($)
  • NOI (Net Operating Income) — currency ($)
  • Cap rates — percentage (%)
  • FF&E reserve — currency ($)
Common pitfall: Room count is a number. RevPAR is currency. Both describe the same property but use different formats.
Key metrics for Tab 5:
  • Locations/facilities — number (no $)
  • Providers/employees — number (no $)
  • Patients/visits — number (no $)
  • Revenue per visit — currency ($)
  • Payor mix — percentage (%)
  • Same-store growth — percentage (%)
  • Readmission rate — percentage (%)
Common pitfall: Patient count is a number. Revenue per patient is currency. The distinction is critical for IC-ready formatting.

Common Mistakes (and How to Avoid Them)

What goes wrong: The Operating Metrics tab shows “Stores: 120"and"Employees:120" and "Employees: 2,500.” Dollar signs on non-financial data is immediately noticed by any PE professional and signals amateur work.How to avoid it: Apply the format rule: Financial data (measuring money) gets signs.Operationaldata(countingthings)doesnot.Stores:120.Employees:2,500.Revenueperstore:signs. Operational data (counting things) does not. Stores: 120. Employees: 2,500. Revenue per store:4,300K.
What goes wrong: Excel interprets year labels as numbers and adds comma formatting. Column headers show “2,024” instead of “2024”. This looks terrible and confuses readers.How to avoid it: Format year cells as text or use a custom format that prevents commas. Enter years with an apostrophe prefix (‘2024) to force text format. Check every tab header.
What goes wrong: Gross margin is typed as “45.0%” instead of calculated as =Gross_Profit/Revenue. When a revenue figure is corrected later, the margin does not update. The data pack contains internally inconsistent numbers.How to avoid it: Every subtotal, total, ratio, and derived metric must be formula-based. Track row numbers when building the model so formulas reference the correct cells. Test by changing an input and verifying that all downstream calculations update.
What goes wrong: Total Assets = 85M,TotalLiabilities+Equity=85M, Total Liabilities + Equity = 82M. A $3M gap. The fundamental accounting identity is violated. This error destroys credibility with any financial professional.How to avoid it: Build a balance check formula (=Assets - Liabilities - Equity) on every tab containing balance sheet data. This cell must equal zero for every period. If it does not, find and fix the error before delivery.
What goes wrong: The EBITDA adjustments include “restructuring charges” in FY2022, FY2023, and FY2024. If the company restructures every year, these charges are not truly non-recurring. Adding them back inflates adjusted EBITDA and misleads the investment committee.How to avoid it: For each add-back, ask: “Has this occurred in multiple years?” If yes, it is likely a recurring cost of doing business and should not be added back. Document the recurrence pattern and flag it for the IC.
What goes wrong: The data pack contains hundreds of data points with no indication of where they came from. An IC member asks “where did you get the $120M revenue figure?” and the analyst cannot trace it to a specific document.How to avoid it: Add cell comments or a separate “Sources” column referencing specific documents and page numbers. For web-sourced data, include URLs. For MCP-sourced data, include the query and timestamp.
What goes wrong: Tab 2 shows “Cost of Goods Sold” in FY2022 but “Cost of Revenue” in FY2023 and “COGS” in FY2024. Same concept, three different names. Formulas may not reference correctly, and the tab is hard to read.How to avoid it: Standardize all line item names across all years during the normalization phase. Pick one term and use it consistently. Document any name changes from the source materials.
What goes wrong: The workbook has heavy borders on every cell, multiple font sizes, colorful fill colors, and merged cells everywhere. It looks like a corporate dashboard, not a PE data pack.How to avoid it: PE data packs use minimal formatting: no cell borders (or very light ones), consistent font (Calibri or Arial 11pt), white backgrounds, and the blue/black/green font convention. Less is more.
What goes wrong: The data pack includes management’s projections as the only forward view. The IC asks “what happens if growth slows to 5%?” and there is no analysis prepared.How to avoid it: Always include at least a Management Case and a Base Case (risk-adjusted). For LBO analysis, add a Downside Case with stress-tested revenue and margins. Document the assumptions for each scenario.
What goes wrong: The data pack has 6 of 8 tabs. Operating Metrics and Market Analysis are missing. The IC cannot compare this deal to prior investments because the format is inconsistent.How to avoid it: All 8 tabs must be present, even if some have limited data. If market analysis data is unavailable, note “Market data to be sourced during diligence” on the tab rather than omitting it entirely.

Daily Workflow Scenarios

Scenario 1: Building a Data Pack from a CIM

Context: A PE firm received a CIM for a potential acquisition. The associate needs to extract data into a standard data pack. Day 1 (4-5 hours): Phase 1-2. Read the CIM. Extract all financial data, operating metrics, and market data. Normalize line items. Day 2 (4-5 hours): Phase 3. Build the 8-tab Excel workbook. Enter data with proper formatting. Build formulas for all calculated metrics. Day 3 (2-3 hours): Phase 4-5. Build scenarios. Quality check. Verify balances, formulas, and formatting. Deliver.

Scenario 2: Building a Data Pack from SEC Filings

Context: No CIM available. Build the data pack from public SEC filings (10-K, 10-Q). Day 1 (3-4 hours): Extract 3-5 years of financial statements from 10-K filings (EDGAR). Standardize line items across years. Day 2 (3-4 hours): Build the workbook. Add operating metrics from MD&A sections. Research market data from industry reports. Day 3 (2-3 hours): Quality check and deliver.

Scenario 3: Updating a Data Pack After New Quarter

Context: The company reported Q3 results. Update the existing data pack with the latest quarter. Timeline (2-3 hours):
  1. Extract Q3 financials from 10-Q or earnings release
  2. Add new column to each financial tab
  3. Update operating metrics
  4. Update LTM (Last Twelve Months) calculations
  5. Verify all formulas still work with the new column
  6. Update Executive Summary with latest figures

Practice Exercise

Exercise: Build a Data Pack for a Restaurant Chain A PE firm is evaluating the acquisition of a fast-casual restaurant chain with 45 locations. Financial data (provided):
MetricFY2022FY2023FY2024
Revenue$52M$61M$72M
COGS (food + labor)$33.8M$39.0M$45.4M
Rent & Occupancy$5.2M$6.1M$7.2M
Other Opex$7.8M$8.5M$9.0M
D&A$2.6M$3.0M$3.6M
Locations354045
Average Unit Volume$1.49M$1.53M$1.60M
Same-Store Sales Growth8%5%7%
Ticket Size$14.50$15.20$15.80
Daily Transactions/Store280275278
Task 1: Build the income statement tab (Tab 2). Calculate EBITDA and margins. Use proper formatting (currency for financial data, formulas for calculations). Task 2: Build the operating metrics tab (Tab 5). Which metrics get dollar signs and which do not? Format each correctly. Task 3: Create the Adjusted EBITDA bridge assuming: owner excess compensation of 300K,onetimeconsultantfeesof300K, one-time consultant fees of 150K (FY2024 only), and new location pre-opening costs of $200K per new location. Task 4: Build a Base Case scenario for FY2025 with: 4 new locations, 4% same-store sales growth, COGS at 63% of revenue, and occupancy costs growing at 8%. Task 5: What font color would each of the following cells use? (a) Revenue FY2024: $72M (b) Gross Margin: 37.0% (c) Location count from Operating Metrics tab referenced on Executive Summary tab.

How to Add to Your Local Context

claude plugin install investment-banking@financial-services-plugins
{
  "mcpServers": {
    "sec-edgar": {
      "command": "edgar-mcp-server",
      "args": ["--user-agent", "YourFirm research@yourfirm.com"]
    }
  }
}

Best Practices

  • Zero tolerance for data errors: Trace every number to its source with a page reference
  • Formula-based calculations only: Never hardcode a calculated value
  • Verify the balance sheet balances: A = L + E for every period
  • Normalization transparency: Every add-back must be documented with dollar impact and rationale
  • Be skeptical of recurring “non-recurring” items: Annual restructuring charges are recurring
  • Test formulas before delivery: Spot-check calculated values manually
  • Follow the blue/black/green convention: It is an industry standard that enables instant recognition

Dependencies

Required:
  • XLSX skill for Excel workbook creation
  • Source financial data (CIM, SEC filings, or management materials)
Optional:
  • SEC EDGAR for public company filings
  • Industry databases for market analysis data
  • QoE report for validated EBITDA adjustments