Ask a procurement manager how they chose their last strategic supplier and you will often hear some version of "they had the best overall fit." Push a little harder and "best fit" usually means "we had a good feeling about them." That feeling costs money. Structured evaluation criteria turn gut decisions into defensible, auditable procurement choices.
The six categories below cover the full landscape of supplier evaluation. Each includes concrete measurement examples, suggested weight ranges, and scoring guidance. Use them as a starting point — adapt and weight them based on what matters most for each specific sourcing event.
1. Quality criteria
Quality is the most common evaluation criterion — and the most poorly measured. "High quality" is not a criterion. A measurable quality criterion has a specific standard and a method of verification.
Example criteria with scoring rubrics:
- Defect rate (PPM). Score 1–5 based on verified parts-per-million defect data. 1 = >10,000 PPM, 5 = <100 PPM. Ask for 12 months of trend data, not a single-point claim.
- Quality certifications. Award points for each relevant certification held. ISO 9001 = +1, IATF 16949 (automotive) = +1, ISO 13485 (medical) = +1, with a cap at 5.
- First-pass yield. Percentage of units passing inspection on first attempt. 1 = <90%, 3 = 95-97%, 5 = >99%. This reveals process capability better than final inspection rates.
- Customer complaint rate. Number of quality complaints per 1,000 units shipped over the last 12 months. Verify through reference calls, not just supplier-provided data.
Suggested weight range: 20-35% for most categories. Higher for regulated industries (medical, aerospace, food).
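The first-pass yield rubric above names only the 1, 3, and 5 anchors. As a minimal sketch, the function below turns it into a full 1-5 mapping. The 2 and 4 bands (90-95% and 97-99%) and the treatment of exact boundary values are assumptions filled in for illustration, and `first_pass_yield_score` is a hypothetical name.

```python
def first_pass_yield_score(fpy_percent: float) -> int:
    """Map first-pass yield (in percent) to a 1-5 score."""
    if not 0 <= fpy_percent <= 100:
        raise ValueError("first-pass yield must be between 0 and 100")
    if fpy_percent > 99:
        return 5  # anchor from the rubric: >99%
    if fpy_percent > 97:
        return 4  # assumed band: the rubric does not define a 4
    if fpy_percent >= 95:
        return 3  # anchor from the rubric: 95-97%
    if fpy_percent >= 90:
        return 2  # assumed band: the rubric does not define a 2
    return 1      # anchor from the rubric: <90%


for fpy in (99.5, 98.0, 96.0, 92.0, 85.0):
    print(f"FPY {fpy}% -> score {first_pass_yield_score(fpy)}")
```

Writing the bands down once, before any supplier is scored, keeps evaluators from adjusting thresholds after they have seen the data.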
2. Cost criteria
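Cost evaluation goes beyond the unit price on the quote. Total Cost of Ownership (TCO) captures what you will actually pay. A supplier with a 10% lower unit price but 3× the logistics cost is not necessarily cheaper: once logistics makes up a meaningful share of the total, the lower quote can turn into the higher landed cost.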
Example criteria with scoring rubrics:
- Total landed cost. Unit price + freight + duties + insurance + handling. Score 1–5 on total landed cost per unit, not quoted price.
- Payment terms. Score based on net payment days offered. 1 = Net 15, 3 = Net 45, 5 = Net 90+. Longer terms improve your working capital.
- Volume discount structure. Evaluate the slope of the discount curve. A flat discount schedule scores lower than one that rewards growing partnership.
- Cost transparency. Does the supplier provide an open-book cost breakdown (materials, labor, overhead, margin)? Open-book = 5, black-box pricing = 1. This is critical for long-term cost management.
Suggested weight range: 15-30%. Higher when sourcing commodities or low-differentiation categories; lower when quality or innovation are the primary drivers.
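The landed-cost arithmetic can be sketched in a few lines of Python. All figures below are hypothetical and chosen only to illustrate the "10% lower unit price, 3× the logistics cost" scenario. The `Quote` class and its field names are assumptions, not part of any procurement system.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Quote:
    """Per-unit cost components for one supplier (hypothetical values)."""
    supplier: str
    unit_price: Decimal
    freight: Decimal
    duties: Decimal
    insurance: Decimal
    handling: Decimal

    def landed_cost(self) -> Decimal:
        # Total landed cost = unit price + freight + duties + insurance + handling
        return (self.unit_price + self.freight + self.duties
                + self.insurance + self.handling)


quote_a = Quote("Supplier A", Decimal("10.00"), Decimal("0.60"),
                Decimal("0.20"), Decimal("0.10"), Decimal("0.10"))
# Supplier B: 10% lower unit price, 3x the logistics cost of Supplier A
quote_b = Quote("Supplier B", Decimal("9.00"), Decimal("1.80"),
                Decimal("0.60"), Decimal("0.30"), Decimal("0.30"))

for quote in (quote_a, quote_b):
    print(f"{quote.supplier}: quoted {quote.unit_price}, "
          f"landed {quote.landed_cost()}")
```

With these numbers Supplier B quotes 9.00 against Supplier A's 10.00 but lands at 12.00 against 11.00, which is why the score should be based on landed cost rather than quoted price.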
3. Delivery & logistics criteria
A perfect-quality product that arrives two weeks late is a problem. Delivery criteria measure reliability, speed, and flexibility.
Example criteria with scoring rubrics:
- On-time delivery rate (OTD). Percentage of orders delivered by the promised date. Score 1–5: 1 = <85%, 3 = 94-96%, 5 = >99%. Verify with 12-month data. Distinguish between OTD to requested date and OTD to promised date.
- Lead time vs. market benchmark. Compare the supplier's standard lead time against the industry average. At or below benchmark = 5, >30% over = 1. Shorter lead times reduce your inventory holding costs.
- Geographic proximity. Distance from the supplier's shipping point to your receiving location. Score based on standard transit lanes: same-region = 5, same-continent = 3, intercontinental = 1.
- Expediting capability. Can the supplier handle a rush order? What is the surcharge? A documented expedite process with the surcharge capped below 15% = 5. No expedite option = 1.
Suggested weight range: 10-20%. Higher for JIT manufacturing or categories where stockouts directly stop production.
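The distinction between OTD to requested date and OTD to promised date is easy to lose in a single headline number. The sketch below computes both from order records. The `Order` structure and the four sample orders are hypothetical; a real evaluation would use 12 months of order history.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Order:
    requested: date  # date the buyer asked for
    promised: date   # date the supplier committed to
    delivered: date  # actual delivery date


def otd_rates(orders: list[Order]) -> tuple[float, float]:
    """Return (OTD to requested date, OTD to promised date) in percent."""
    if not orders:
        raise ValueError("at least one order is required")
    total = len(orders)
    on_time_requested = sum(o.delivered <= o.requested for o in orders)
    on_time_promised = sum(o.delivered <= o.promised for o in orders)
    return (on_time_requested / total * 100, on_time_promised / total * 100)


# Hypothetical sample data
orders = [
    Order(date(2024, 3, 1), date(2024, 3, 5), date(2024, 3, 4)),
    Order(date(2024, 3, 10), date(2024, 3, 10), date(2024, 3, 10)),
    Order(date(2024, 3, 15), date(2024, 3, 20), date(2024, 3, 22)),
    Order(date(2024, 4, 1), date(2024, 4, 3), date(2024, 4, 2)),
]

to_requested, to_promised = otd_rates(orders)
print(f"OTD to requested date: {to_requested:.1f}%")
print(f"OTD to promised date:  {to_promised:.1f}%")
```

In this sample the supplier meets its own promised date on three of four orders but the buyer's requested date on only one, so the two figures tell very different stories about the same deliveries.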
4. Risk & compliance criteria
Too many teams evaluate risk informally. A structured risk assessment during supplier selection prevents the scrambling that happens after a disruption.
Example criteria with scoring rubrics:
- Financial stability. Use a third-party credit rating (e.g., D&B rating, or an equivalent). Investment-grade = 5, high risk of default = 1. For private suppliers, request audited financials or bank references.
- Business continuity plan (BCP). Does the supplier have a documented, tested BCP? Documented and tested within 12 months = 5, documented but untested = 3, no BCP = 1.
- Single-source dependency. Is the supplier your only qualified source for this category? Single-source with no alternatives = 1, one of three or more qualified sources = 5. This measures your own exposure; it is not a judgment of the supplier.
- Cybersecurity posture. Score additively out of 5: SOC 2 Type II or ISO 27001 certification = +3 points, documented incident response plan = +1, cyber insurance = +1. No formal program = 0.
- Regulatory compliance. Map required regulations to supplier capabilities. Each regulatory gap = -1 from a baseline of 5.
Suggested weight range: 10-20%. Increase to 20-25% for suppliers handling sensitive data, critical infrastructure, or operating in politically volatile regions.
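The cybersecurity and regulatory rubrics above are additive and subtractive rather than banded, so they translate directly into arithmetic. The following is a minimal sketch: the function names and the regulation labels are hypothetical, and flooring the regulatory score at 0 is an assumption, since the rubric does not say what happens after five gaps.

```python
def cybersecurity_score(certified: bool, incident_response_plan: bool,
                        cyber_insurance: bool) -> int:
    """+3 for SOC 2 Type II or ISO 27001, +1 for a documented incident
    response plan, +1 for cyber insurance. Nothing in place scores 0."""
    return 3 * certified + incident_response_plan + cyber_insurance


def regulatory_score(required: set[str], covered: set[str]) -> int:
    """Start from a baseline of 5 and subtract 1 per regulatory gap.

    The floor of 0 is an assumption; the rubric sets no lower bound.
    """
    gaps = required - covered
    return max(0, 5 - len(gaps))


# Hypothetical supplier: certified, has an incident response plan, no insurance
print(cybersecurity_score(True, True, False))

# Hypothetical requirement set with one uncovered regulation
required = {"REACH", "RoHS", "GDPR"}
covered = {"REACH", "GDPR"}
print(regulatory_score(required, covered))
```

Using a set difference for the gaps means regulations the supplier covers but you do not require have no effect on the score.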
5. Capacity & scalability criteria
A supplier that is perfect today but maxed out on capacity cannot support your growth. Capacity evaluation ensures your supplier can scale with you — or warns you early that they cannot.
Example criteria with scoring rubrics:
- Current capacity utilization. What percentage of their production capacity is currently committed? 1 = >95% (no room), 3 = 75-85% (some headroom), 5 = <60% (significant headroom). Above 90%, even small demand spikes cause delays.
- Scalability track record. Has the supplier successfully scaled production for a client in the last three years? Documented case with metrics = 5, anecdotal claim = 2, no track record = 1.
- Your share of supplier revenue. What percentage of the supplier's total revenue would you represent? 5-15% = 5 (important enough for priority treatment, not so large you create dependency risk). <1% = 1 (you will not be a priority). >30% = 3 (you have leverage but the supplier is dangerously dependent on you).
- Geographic diversification. Does the supplier operate from multiple production sites in different regions? Multi-site, multi-region = 5. Single site = 1. This measures resilience, not current output.
Suggested weight range: 10-15% for stable categories, 15-25% for high-growth categories or startups scaling rapidly.
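For capacity utilization, a score alone can hide the point made above about the 90% threshold. This sketch returns the 1-5 score and a separate delay-risk flag. The 2 and 4 bands (85-95% and 60-75%) are assumptions that fill the gaps between the rubric's stated anchors, and both function names are hypothetical.

```python
def capacity_utilization_score(utilization_percent: float) -> int:
    """Map committed capacity (in percent) to a 1-5 score."""
    if utilization_percent < 0:
        raise ValueError("utilization cannot be negative")
    if utilization_percent < 60:
        return 5  # anchor from the rubric: significant headroom
    if utilization_percent < 75:
        return 4  # assumed band: the rubric does not define a 4
    if utilization_percent <= 85:
        return 3  # anchor from the rubric: some headroom
    if utilization_percent <= 95:
        return 2  # assumed band: the rubric does not define a 2
    return 1      # anchor from the rubric: no room


def delay_risk(utilization_percent: float) -> bool:
    """Flag suppliers above 90%, where small demand spikes cause delays."""
    return utilization_percent > 90


for utilization in (55, 70, 80, 92, 97):
    flag = " (delay risk)" if delay_risk(utilization) else ""
    print(f"{utilization}% committed -> "
          f"score {capacity_utilization_score(utilization)}{flag}")
```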
6. Innovation & partnership criteria
For strategic suppliers, today's capabilities matter less than tomorrow's. Innovation criteria evaluate whether a supplier will help you improve over time, not just maintain the status quo.
Example criteria with scoring rubrics:
- R&D investment. Percentage of revenue invested in R&D. >5% = 5, 2-5% = 3, <2% = 1. Cross-reference with your industry: an R&D investment that is high for packaging may be low for electronics.
- Cost-reduction proposals. In the last 24 months, has the supplier proactively proposed cost-saving ideas? 3+ documented proposals = 5, 1-2 = 3, none = 1. Unsolicited proposals are a stronger signal than responses to your requests.
- Collaborative history. Has the supplier co-developed a product, process, or specification with a client? Verified co-development = 5. Transactional-only relationship = 1. This is the strongest predictor of future partnership value.
- IP portfolio. Number of active patents or proprietary processes relevant to your category. Patent count alone is weak — ask for evidence that patents translate to product or process differentiation.
Suggested weight range: 5-15%. Higher for categories where technological change is rapid or supplier-led innovation drives competitive advantage.
Putting it all together
The criteria above give you a starting set of 20+ measurable options. For any single sourcing event, select 6-8 criteria that align with your business priorities. Assign weights that reflect what matters — not what is easiest to measure. Then apply a consistent 1-5 scoring scale across all criteria so scores are directly comparable.
A defense contractor evaluating a precision machining supplier might weight Quality at 30%, Risk at 20%, Cost at 15%, Delivery at 15%, Capacity at 10%, and Innovation at 10%. A retailer evaluating a private-label manufacturer might set very different priorities: Cost at 30%, Delivery at 25%, Quality at 20%, with Capacity, Risk, and Innovation sharing the remaining 25%. The criteria are reusable; the weights are situational.
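To show how weights and scores combine, here is a minimal weighted-scoring sketch using the defense contractor weights from the example. As a simplification it scores one value per category, where a real sourcing event would score the 6-8 selected criteria individually. Both suppliers and their scores are hypothetical, as is the `weighted_score` helper.

```python
# Weights in percent, taken from the defense contractor example
DEFENSE_WEIGHTS = {
    "quality": 30, "risk": 20, "cost": 15,
    "delivery": 15, "capacity": 10, "innovation": 10,
}


def weighted_score(scores: dict[str, int], weights: dict[str, int]) -> float:
    """Combine 1-5 category scores into one weighted score on the same scale."""
    if sum(weights.values()) != 100:
        raise ValueError("weights must sum to 100")
    if scores.keys() != weights.keys():
        raise ValueError("scores and weights must cover the same categories")
    if any(not 1 <= s <= 5 for s in scores.values()):
        raise ValueError("each score must be between 1 and 5")
    return sum(weights[c] * scores[c] for c in weights) / 100


# Hypothetical suppliers scored on the 1-5 scale
supplier_a = {"quality": 5, "risk": 4, "cost": 2,
              "delivery": 4, "capacity": 3, "innovation": 3}
supplier_b = {"quality": 3, "risk": 3, "cost": 5,
              "delivery": 4, "capacity": 4, "innovation": 2}

for name, scores in (("Supplier A", supplier_a), ("Supplier B", supplier_b)):
    print(f"{name}: {weighted_score(scores, DEFENSE_WEIGHTS):.2f}")
```

Under these weights the higher-quality, lower-risk Supplier A comes out ahead despite its weaker cost score. Swapping in a cost-heavy weight set could reverse the ranking without changing a single score, which is the sense in which the weights are situational.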