Buyer's Guide
Sourcing OEM beer is partly a numbers exercise and partly a sensory decision. A spec sheet tells you ABV, bitterness units, and carbohydrate content. It does not tell you whether the beer will sell. That judgment requires a structured tasting, and a disciplined way of separating what you personally prefer from what your market actually needs.
Published 17 June 2026 · By the JINPAI Brewery production team
When a consumer drinks a beer, they are deciding whether they enjoyed it. When a buyer evaluates a sample, they are making a procurement decision on behalf of thousands of consumers who may have preferences nothing like their own. That distinction matters. A buyer who rejects a dry, bitter lager because they personally prefer sweeter beer is making the wrong call if their target demographic — weekend sports fans, fitness-oriented drinkers, hospitality accounts — actively seeks that dryness.
Professional beer evaluation separates two different questions. The first: is this beer made correctly? The second: is it the right product for the brief? Both require sensory input, but they use different frames. The first is a quality question with defensible right and wrong answers. The second is a commercial question that only the buyer can answer — but answering it with rigour still requires knowing what you are tasting.
The Beer Judge Certification Program (BJCP) and Cicerone curriculum both teach structured evaluation for a reason. Trained evaluators consistently outperform untrained ones on detection accuracy and inter-rater agreement. You do not need a certification to evaluate an OEM sample, but you do need a method. Without one, you will end up with a room full of opinions and no decision-making framework.
Conditions matter more than most buyers realise. A beer that smells clean in a properly ventilated room at 10°C can smell flat or harsh straight from a 4°C fridge in a kitchen that doubles as a meeting room. Getting the setup right is not fussiness — it is controlling the variables that sit between the product and your judgment.
Use tulip glasses or standard tasting glasses — a shape with a slight inward taper at the rim that concentrates aroma. Never use frosted glasses; the ice residue dilutes and chills the sample unevenly. All glasses must be identical in shape and size, rinsed with the sample beer before the evaluation pour (called "beer-rinsing"), and odour-free — any detergent or dishwasher residue will register on the nose and kill fine aroma perception.
Standard evaluation temperature for most commercial lager and ale styles is 8–12°C. At 4°C, aroma is suppressed and bitterness is muted — you are not tasting the beer, you are tasting cold. At 16°C and above, volatile compounds off-gas quickly and flaws that are normally minor become dominant. Remove samples from refrigeration 15–20 minutes before tasting. For aromatic styles — dry-hopped ales, fruit beers, tea beers — 10–14°C gives the best aroma window.
Evaluate on an empty or near-empty stomach: a full lunch dulls perception of both bitterness and sweetness. Between samples, cleanse with still water and plain unsalted crackers or white bread. Avoid coffee, citrus, or strongly flavoured foods for at least an hour before the session. Sequence matters: evaluate lower-ABV, lower-bitterness, paler samples first and progress toward heavier, darker, or more intensely flavoured ones. Going the other direction means the first heavy sample masks everything that follows.
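The sequencing rule above can be sketched as a simple sort. This is a minimal illustration, assuming each sample's ABV, IBU, and SRM are available from its data sheet. The `Sample` type, the sample names and values, and the choice to rank by ABV first, then IBU, then SRM are assumptions made for this example rather than an established standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    name: str    # hypothetical label for illustration
    abv: float   # % alcohol by volume
    ibu: float   # bitterness units
    srm: float   # colour


def tasting_order(samples):
    """Order samples lightest first: lower ABV, then lower IBU, then paler."""
    return sorted(samples, key=lambda s: (s.abv, s.ibu, s.srm))


# Hypothetical flight of three samples.
flight = [
    Sample("dark lager", 5.0, 18, 20),
    Sample("light lager", 4.2, 10, 3),
    Sample("export lager", 5.0, 14, 5),
]

order = [s.name for s in tasting_order(flight)]
print(order)  # ['light lager', 'export lager', 'dark lager']
```

If intensity of added flavour (fruit, tea, dry hopping) matters more than ABV for a given flight, the sort key would need an extra field for it; the three numeric parameters alone do not capture that.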
When comparing multiple samples from different suppliers, blind tasting with randomised coded labels is the standard. Knowing which glass came from your preferred supplier creates a halo effect that is well-documented and measurable. When evaluating a single sample against a written brief, identified tasting is fine. The key variable is bias, not secrecy — control for it accordingly.
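One way to prepare randomised coded labels is sketched below. It assumes three-digit codes (a common convention in sensory work) and a single person who holds the key and does the pouring. The function name, the supplier names, and the optional `seed` parameter are hypothetical and exist only for this illustration.

```python
import random


def assign_blind_codes(suppliers, seed=None):
    """Give each supplier a unique random three-digit code and a shuffled pour order."""
    rng = random.Random(seed)
    # sample() draws without replacement, so no two suppliers share a code.
    codes = rng.sample(range(100, 1000), k=len(suppliers))
    key = dict(zip(suppliers, codes))   # kept sealed by whoever pours
    pour_order = list(key.values())
    rng.shuffle(pour_order)             # order in which glasses reach evaluators
    return key, pour_order


# Hypothetical supplier names.
key, pour_order = assign_blind_codes(["Supplier A", "Supplier B", "Supplier C"])
print("Pour in this order:", pour_order)
```

Evaluators see only the codes in `pour_order`; the `key` mapping is opened after all scorecards are complete.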
Every structured beer evaluation follows the same four-step sequence: appearance, aroma, flavour, and mouthfeel. These are not four equally weighted categories. Aroma and flavour carry the most information, but appearance and mouthfeel catch specific classes of problems that nose and palate alone will miss.
Pour 150–180 ml into a beer-rinsed glass held at a 45-degree angle. Observe colour, clarity, and head. Colour should be consistent with the style brief: a standard commercial lager is straw to pale gold (3–6 SRM); a dark lager runs 15–25 SRM. Clarity in filtered commercial beers should be bright. Visible haze in a filtered lager is a filtration or microbiological flag, not a stylistic choice. Head retention matters commercially: assess whether a head forms, how dense it is, and how quickly it collapses. A head that vanishes in 20 seconds often indicates low protein content, which may be correct for a light lager but is worth noting.
Swirl the glass once and nose it immediately — volatile aroma compounds dissipate fast. Take two or three short sniffs rather than one long inhale; olfactory fatigue sets in quickly. You are looking for: base character (malt sweetness, grain, bread), hop character (if any — floral, herbal, citrus, resinous), fermentation character (clean, fruity esters, or off-note), and anything that should not be there. Prioritise detection over description on the first pass — decide whether the aroma is clean and appropriate before you start writing.
Take a sip of 15–20 ml and hold it for 2–3 seconds before swallowing. Assess: initial sweetness or dryness on entry, bitterness build and timing (does it arrive early and fade, or come late and linger?), flavour balance (does any one element dominate inappropriately?), and finish length. Bitterness is measured in IBUs, but perceived bitterness depends heavily on residual sweetness: a 25 IBU beer with low attenuation can taste less bitter than a 20 IBU beer with high attenuation. Evaluate bitterness in context, not as a raw number.
Assess body (light, medium, full), carbonation level (note how long the CO2 sensation lasts on the palate: a light lager typically feels crisp and fades quickly; a stout lingers), and finish texture (dry, astringent, warming, smooth). Overcarbonation creates a sharp prickle that masks flavour; undercarbonation makes a beer feel flat and heavy regardless of its actual attenuation. For export packaged beer, carbonation at source may differ from what arrives, so note the batch date and the transit and storage conditions.
You cannot evaluate a beer without knowing what it is supposed to be. A sample reviewed in isolation gives you impressions. A sample reviewed against its brief gives you a verdict. Before the tasting session, prepare a one-page reference for each sample: style name, target ABV, target IBU, target SRM (colour), target attenuation, and any specific inclusions (fruit addition, hop variety, adjunct). These are your benchmarks.
For standard commercial lager — the highest-volume OEM product category — the reference parameters are well-established. A mass-market export lager typically runs 4.2–5.0% ABV, 8–15 IBU, SRM 3–6, with apparent attenuation above 78%. The aroma should be neutral to faintly malty with no detectable hops. The flavour should be clean, lightly sweet on entry, with a mild bitterness in the finish that does not persist. Mouthfeel: light-bodied, well-carbonated, clean exit. Any meaningful deviation from that profile is a data point — is it intentional (within brief) or unintentional (outside brief)?
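The lager reference ranges above lend themselves to a mechanical first check before the tasting. The sketch below assumes the measured values come from the supplier's data sheet. The dictionary keys, the function name, and the sample values are hypothetical, and "above 78%" is treated here as a lower bound with no upper limit.

```python
# Reference ranges for a mass-market export lager, taken from the paragraph above.
# None means no bound was stated on that side.
LAGER_BRIEF = {
    "abv": (4.2, 5.0),
    "ibu": (8, 15),
    "srm": (3, 6),
    "attenuation_pct": (78, None),
}


def deviations(measured, brief):
    """Return {parameter: (value, low, high)} for each value outside its range."""
    out = {}
    for name, (low, high) in brief.items():
        value = measured[name]
        too_low = low is not None and value < low
        too_high = high is not None and value > high
        if too_low or too_high:
            out[name] = (value, low, high)
    return out


# Hypothetical measured values for one sample.
sample = {"abv": 4.8, "ibu": 17, "srm": 4, "attenuation_pct": 80}
flags = deviations(sample, LAGER_BRIEF)
print(flags)  # {'ibu': (17, 8, 15)}
```

A flagged parameter is only a prompt for the question the text asks: is the deviation intentional and within the brief, or not? The check does not replace the tasting.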
For fruit beers and specialty adjunct styles, the reference frame shifts. Fruit character should be identifiable but proportionate: a lychee beer should read as lychee, not as generic floral sweetener. Acidity from fruit additions should be clean and consistent across the sample, not patchy. Colour should be stable, not oxidised brown from a fruit addition that was processed too aggressively. In all cases, the key question is: does this sample hit the parameters in the brief, and does the sensory experience match the product concept you agreed on?
When JINPAI sends an OEM sample, it is accompanied by a technical data sheet stating the actual measured values: ABV, IBU, SRM, original gravity, final gravity, attenuation, and any relevant addition levels. The buyer's job is to taste the sample against those numbers and against the agreed product concept. If the spec sheet says 12 IBU and the beer tastes noticeably more bitter than a standard lager, something does not reconcile — and that gap is worth investigating before confirming the formulation.
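Before tasting against the data sheet, it can help to confirm that the sheet's own numbers agree with each other. The sketch below uses two widely used brewing approximations: apparent attenuation as (OG - FG) / (OG - 1), and ABV as roughly (OG - FG) × 131.25. It assumes gravities are given as specific gravity (for example 1.044) rather than degrees Plato. The sheet values and the tolerances are hypothetical, not JINPAI figures.

```python
def apparent_attenuation_pct(og, fg):
    """Apparent attenuation in percent, from specific gravities."""
    return (og - fg) / (og - 1.0) * 100.0


def approx_abv(og, fg):
    """Common approximation of ABV (%) from specific gravities."""
    return (og - fg) * 131.25


def reconciles(stated, computed, tolerance):
    """True if a stated value is within tolerance of the computed one."""
    return abs(stated - computed) <= tolerance


# Hypothetical data sheet values for illustration only.
sheet = {"og": 1.044, "fg": 1.008, "abv": 4.7, "attenuation_pct": 82.0}

abv_calc = approx_abv(sheet["og"], sheet["fg"])
att_calc = apparent_attenuation_pct(sheet["og"], sheet["fg"])

# Tolerances are assumptions; agree real ones with the supplier.
abv_ok = reconciles(sheet["abv"], abv_calc, tolerance=0.3)
att_ok = reconciles(sheet["attenuation_pct"], att_calc, tolerance=2.0)
print(round(abv_calc, 2), round(att_calc, 1), abv_ok, att_ok)
```

If the stated figures do not reconcile with the gravities, that is a question for the supplier before the tasting, in the same way as a gap between the stated IBU and the perceived bitterness.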
Verbal tasting notes are not a decision record. They depend on memory, individual vocabulary, and the social dynamics of whoever speaks first. A scorecard fixes the evaluation at the moment of tasting and creates a document that can be shared, stored, and compared across sessions. This matters for OEM sourcing because you may be evaluating a second batch six months after the first — you need a baseline to compare against, not a recollection.
A practical buyer's scorecard does not need to be the 50-point BJCP sheet. It needs six things: a numerical score for each of the four evaluation categories (1–5 is sufficient), an overall score, a free-text field for notable positives, a free-text field for concerns or deviations, a verdict field (Accept / Accept with modification / Reject), and the tasting conditions (date, temperature, evaluator names, batch code). That is one page per sample. Each evaluator completes it independently before any group discussion, and the sheets are then compared. Disagreements between evaluators are where the most useful information lives.
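The scorecard described above can be captured as a small record, which also makes evaluator disagreements easy to surface. This is a minimal sketch under stated assumptions: the category names (appearance, aroma, flavour, mouthfeel) are inferred from the four evaluation steps, the overall score is assumed to use the same 1–5 scale, and the `min_spread` threshold of two points is arbitrary. All class, field, and function names are hypothetical.

```python
from dataclasses import dataclass

CATEGORIES = ("appearance", "aroma", "flavour", "mouthfeel")
VERDICTS = ("Accept", "Accept with modification", "Reject")


@dataclass
class Scorecard:
    evaluator: str
    batch_code: str
    tasting_date: str        # e.g. "2026-06-17"
    serving_temp_c: float
    scores: dict             # category name -> score from 1 to 5
    overall: int             # assumed to share the 1-5 scale
    positives: str = ""
    concerns: str = ""
    verdict: str = "Accept"

    def __post_init__(self):
        if set(self.scores) != set(CATEGORIES):
            raise ValueError(f"scores must cover exactly {CATEGORIES}")
        for value in [*self.scores.values(), self.overall]:
            if not 1 <= value <= 5:
                raise ValueError("scores must be between 1 and 5")
        if self.verdict not in VERDICTS:
            raise ValueError(f"verdict must be one of {VERDICTS}")


def disagreements(cards, min_spread=2):
    """Return {category: spread} where evaluators differ by min_spread or more."""
    out = {}
    for category in CATEGORIES:
        values = [card.scores[category] for card in cards]
        spread = max(values) - min(values)
        if spread >= min_spread:
            out[category] = spread
    return out


# Two hypothetical evaluators scoring the same sample independently.
card_a = Scorecard("Evaluator A", "B-001", "2026-06-17", 10.0,
                   {"appearance": 4, "aroma": 4, "flavour": 3, "mouthfeel": 4}, 4)
card_b = Scorecard("Evaluator B", "B-001", "2026-06-17", 10.0,
                   {"appearance": 4, "aroma": 2, "flavour": 3, "mouthfeel": 3}, 3,
                   concerns="Possible butter note on the nose",
                   verdict="Accept with modification")

flagged = disagreements([card_a, card_b])
print(flagged)  # {'aroma': 2}
```

The flagged categories are a starting point for the group discussion, not a substitute for it: a two-point gap on aroma says the evaluators perceived something differently, not which of them is right.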
Attach the scorecard to the sample request and the subsequent purchase order. If there is a quality dispute three months into the supply relationship, the scorecard is the paper trail that establishes what was agreed. Do not rely on email chains and meeting notes for a sensory specification — formalise it.
Beer ages. Some of what you detect in a sample is the beer's recipe character; some of it is the age of the specific bottle in your hand. Knowing the difference is essential for OEM evaluation. Check the batch code and production date before you taste. A sample that is four months old and has been stored unrefrigerated is not a fair representative of the product — it is a fair representative of what the product becomes under poor storage. If you are evaluating freshness potential, you need the earliest available batch from cold storage.
The main freshness indicators differ in how quickly they change. Hop aroma fades first and fastest, so any beer described as hop-forward should show that character clearly in a fresh sample. Malt character is more stable. Esters and fermentation-derived flavours are moderately stable in well-made beer. The first oxidation notes in a packaged lager typically arrive as a cardboard, papery, or wet-bread character that is distinctly different from the malt sweetness of a fresh sample. If you detect that note, ask for the batch date before drawing any conclusion about the formulation.
This is the most important distinction in buyer-side evaluation, and the most commonly ignored. "I don't like this" is a statement about the evaluator. "This is wrong" is a statement about the product. They lead to entirely different decisions, and conflating them is expensive.
A beer can be technically correct — well-made, true to style, hitting every parameter in the brief — and still not appeal to a particular person's palate. That is fine. The question is not whether the buyer likes it; the question is whether the target consumer will, and whether the product delivers the market positioning the brand requires. If the brief calls for a dry, bitter light lager and the sample delivers exactly that, an evaluator who prefers sweet beers should note their preference and approve the sample. Rejecting a correctly made product because of personal palate is a procurement error, not a quality control decision.
Conversely, a beer can taste appealing to an evaluator while still being wrong. A butter note can read as richness to an untrained palate; to a trained one it is diacetyl, a fault that points to incomplete fermentation or bacterial contamination and that can intensify in the package over time. A slightly hazy pale lager might look artisanal to a craft-aware buyer; it is an unresolved filtration issue that will produce variable batches. The only way to catch these is to know what the faults are and actively look for them, not to rely on whether you like the overall impression.
The scorecard enforces this separation. By scoring each category independently and noting deviations against the brief rather than against personal preference, evaluators have to articulate their reasoning. "Reject — diacetyl present at detectable level, not appropriate to the style" is a decision. "Reject — didn't like it" is not one you can action, document, or defend to a factory.
Standard evaluation temperature for most commercial lager and ale styles is 8–12°C: cold enough to reflect consumer serving conditions but not so cold that aroma is suppressed. Tasting a beer at 4°C (straight from the fridge) suppresses both aroma and bitterness perception. Tasting at 18°C amplifies all flavours, including flaws. For aromatic beer styles (dry-hopped ales, tea beers, fruit beers), tasting at 10–14°C gives the best aroma expression.
The practical limit for trained evaluators is 8–12 samples per session with adequate palate-cleansing between each. Beyond 12 samples, palate fatigue sets in and discrimination accuracy drops. For untrained evaluators (marketing teams, brand owners) the useful limit is 4–6. When evaluating multiple variants of the same base style (e.g., three hop rate options), blind tasting with randomised order is strongly recommended to eliminate presentation bias.
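When a sample list exceeds those limits, it needs to be split across sessions. The sketch below is one simple way to do that, assuming the upper end of each range is used as the cap (12 for trained panels, 6 for untrained). The function name and panel labels are hypothetical.

```python
# Upper ends of the per-session ranges given in the text above.
MAX_PER_SESSION = {"trained": 12, "untrained": 6}


def plan_sessions(samples, panel="trained"):
    """Split a list of samples into consecutive sessions within the panel's cap."""
    limit = MAX_PER_SESSION[panel]
    return [samples[i:i + limit] for i in range(0, len(samples), limit)]


# Hypothetical list of 14 coded samples for an untrained panel.
coded_samples = [f"S{n:02d}" for n in range(1, 15)]
sessions = plan_sessions(coded_samples, panel="untrained")
print([len(s) for s in sessions])  # [6, 6, 2]
```

Samples that need direct comparison, such as variants of the same base style, should be kept in the same session rather than split by position in the list.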
For a meaningful comparison: both samples should be from the same package format, the same production date if possible, stored under identical conditions for the same duration, served at the same temperature in identical glassware, and evaluated by the same panel in the same session. Any of these variables, if uncontrolled, can produce differences in the tasting that have nothing to do with the actual product difference you are trying to evaluate.
OEM beer procurement is a sensory decision as much as a commercial one. A tasting without a method produces a room full of opinions. A structured evaluation — controlled conditions, a four-step framework, a documented scorecard, and clear separation between fault detection and personal preference — produces a decision you can stand behind, communicate clearly to a supplier, and use as a baseline for quality management across the supply relationship.
JINPAI supplies samples with full technical data sheets for every product in the range. Our export team will walk through the parameters with you before the tasting session and answer questions on any deviation you find. If you are sourcing for a new market or comparing formats, send us the brief — target consumer, product positioning, volume, destination market — and we will configure the sample package accordingly.