AZIC Education

Antminer S21 Hashboard Repair Guide

Step-by-step guide to diagnose and repair Antminer S21 hashboards — voltage domain testing, BM1368 chip replacement, and common failure fixes.

Overview

The Bitmain Antminer S21 is a high-performance SHA-256 Bitcoin miner delivering approximately 200 TH/s across three hashboards. Each hashboard contains 129 BM1368 ASIC chips organized into 12 voltage domains of 10–11 chips each. The BM1368 is manufactured on a 5nm process node, making it one of the most energy-efficient mining chips available.

This guide provides a complete, step-by-step procedure for diagnosing and repairing S21 hashboards. Whether you are dealing with missing chips, dead voltage domains, low hashrate, or complete board failure, this guide covers the systematic approach used by professional repair technicians.

Safety First: The S21 PSU (APW17) delivers 12V DC at up to 300A. Always disconnect power and wait at least 60 seconds for capacitor discharge before handling hashboards. Wear an ESD wrist strap connected to a grounded mat at all times. Failure to follow proper safety procedures can result in serious injury or permanent damage to components. See the ESD Safety Guide for complete ESD procedures.

Required Tools

Basic Tools

  • Digital multimeter: Fluke 15B+ or equivalent (0.01V resolution or better on the DC voltage range)
  • Phillips #2 screwdriver — for enclosure disassembly
  • ESD wrist strap and mat — mandatory for all hashboard work
  • Compressed air — filtered, moisture-free (for dust removal)
  • 99% isopropyl alcohol (IPA) — for cleaning flux residue and thermal paste
  • Lint-free microfiber cloths — for cleaning
  • Magnifying glass or loupe (10x–20x) — for visual inspection
  • Bright LED work light — angled lighting reveals solder defects

Advanced Tools (for component-level repair)

  • Hot air rework station — Quick 861DW or Hakko FR-810 (for BGA rework)
  • Soldering iron — Hakko FX-951 or JBC CD-2BE with fine tip (for discrete components)
  • Flux — Amtech NC-559-V2 or equivalent no-clean flux
  • Solder paste — SAC305 or 63/37 leaded (for BGA reballing)
  • Solder wick — 2mm width for pad cleaning
  • Thermal paste — Arctic MX-5 or Thermal Grizzly Kryonaut
  • Oscilloscope (optional) — for signal chain analysis
  • Thermal camera (optional) — FLIR or equivalent for hot spot detection
  • BGA stencil — for BM1368 reballing if needed

Prerequisites

Before starting this repair, you should:

  1. Be comfortable using a digital multimeter for DC voltage and continuity measurements — see our Multimeter Testing Guide
  2. Understand the basics of hashboard architecture — see How Hashboards Work
  3. Have a properly set up ESD-safe workspace — see ESD Safety Guide
  4. Have access to the tools listed above
  5. Be familiar with basic soldering if performing component-level repair — see Soldering Techniques

Hashboard Specifications

| Parameter | Value |
| --- | --- |
| Miner Model | Antminer S21 |
| Manufacturer | Bitmain |
| ASIC Chip | BM1368 (5nm SHA-256) |
| Hashrate per Board | ~67 TH/s |
| Total Hashrate (3 boards) | ~200 TH/s |
| Chips per Board | 129 |
| Voltage Domains | 12 |
| Chips per Domain | 10–11 |
| Core Voltage (VDD) | 0.30V ±0.02V per domain |
| I/O Voltage (VDDIO) | 1.8V |
| Logic Voltage (VDD33) | 3.3V |
| Input Voltage | 12V DC |
| Power per Board | ~1167W |
| Signal Chain | CLK, CI (Command In), RI (Response In), RST, BO |
| Connector | 18-pin hashboard connector |
| PSU | APW17 (3600W) |
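
The figures above can be cross-checked with simple arithmetic. The following Python sketch is illustrative only: the constant names are hypothetical, and the split into three 10-chip and nine 11-chip domains is inferred from the totals (129 chips, 12 domains) and is not taken from a Bitmain document.

```python
# Cross-check the S21 hashboard figures quoted in this guide.
CHIPS_PER_BOARD = 129
DOMAINS = 12
BOARDS = 3
BOARD_HASHRATE_THS = 67.0
BOARD_POWER_W = 1167.0
INPUT_VOLTAGE_V = 12.0

# Solve a + b = 12 and 10a + 11b = 129 for the domain sizes.
domains_of_11 = CHIPS_PER_BOARD - 10 * DOMAINS   # b
domains_of_10 = DOMAINS - domains_of_11          # a

hashrate_per_chip = BOARD_HASHRATE_THS / CHIPS_PER_BOARD  # TH/s per chip
board_input_current = BOARD_POWER_W / INPUT_VOLTAGE_V     # amps at 12V
total_chips = BOARDS * CHIPS_PER_BOARD
total_hashrate = BOARDS * BOARD_HASHRATE_THS

print(f"{domains_of_10} domains of 10 chips, {domains_of_11} domains of 11 chips")
print(f"Hashrate per chip: {hashrate_per_chip:.3f} TH/s")
print(f"Input current per board: {board_input_current:.2f} A")
print(f"Whole miner: {total_chips} chips, about {total_hashrate:.0f} TH/s")
```

By these figures a single board draws roughly 97A at 12V under full load, which is worth keeping in mind when sizing a bench supply for powered tests.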

Repair Procedure

Step 1: Visual Inspection

Begin every repair with a thorough visual inspection. Many common failures are visible to the naked eye or under magnification.

Remove the hashboard from the miner enclosure:

  1. Disconnect all power cables from the PSU
  2. Wait 60 seconds for capacitor discharge
  3. Remove the top cover (4 Phillips screws)
  4. Disconnect the 18-pin hashboard data cable from the control board
  5. Disconnect the power cables from the hashboard
  6. Slide the hashboard out of its slot

Inspect under bright, angled light for:

  • Burnt or discolored components — darkened areas around chips or voltage regulators indicate thermal damage or short circuits. Pay special attention to the buck converter areas between voltage domains.
  • Cracked or cold solder joints — look for dull, grainy solder joints instead of smooth, shiny ones. Common on the 18-pin connector and large capacitors.
  • Shifted or misaligned chips — BM1368 chips should be perfectly aligned on their pads. A shifted chip indicates a rework attempt or thermal cycling damage.
  • Swollen or bulging capacitors — electrolytic capacitors near voltage regulators that are domed on top have failed.
  • Corrosion or liquid damage — green or white deposits indicate moisture exposure. Clean with IPA before further diagnostics.
  • Damaged or burnt traces — dark lines on the PCB surface indicate a trace has carried excessive current.
  • Flux residue buildup — excessive flux around chips can indicate prior rework. Clean with IPA to inspect the actual joint quality beneath.
  • Bent or damaged connector pins — check the 18-pin hashboard connector for bent, pushed-back, or oxidized pins.

Pro tip: Use a thermal camera on a powered board (if it partially works) before disassembly. Hot spots visible on thermal imaging instantly identify shorted chips or failed regulators, saving significant diagnostic time.

Document your findings before proceeding. Take photos of any suspicious areas — you will reference these during component-level diagnosis.

Step 2: Voltage Domain Testing

Voltage domain testing is the most important diagnostic step. Each of the 12 voltage domains on the S21 hashboard is powered by its own buck converter (voltage regulator) that steps 12V input down to approximately 0.30V for the ASIC chip cores.

What you need:

  • Digital multimeter (resistance mode for Step 2a; 2V DC range for Step 2b, because a 200mV range cannot display 0.30V)
  • Hashboard removed from the miner (Step 2a is a power-off resistance check; power is applied only in Step 2b)

2a: Domain Resistance Check (Power Off)

Before applying power, check each domain's resistance to identify shorts:

  1. Set your multimeter to resistance mode (Ω)
  2. Place the red probe on the positive output pad of each domain's buck converter
  3. Place the black probe on the ground plane (any large ground pad or screw hole)
  4. A healthy domain reads 2–10Ω
  5. A reading of 0Ω or near 0Ω indicates a short circuit in that domain — do NOT power the board
  6. An open reading (OL/∞) indicates a broken connection

| Domain | Measured Resistance | Interpretation |
| --- | --- | --- |
| Any | 2–10Ω | Normal |
| Any | 0–0.5Ω | Short circuit: shorted chip or capacitor |
| Any | OL (open) | Broken trace or lifted regulator |
| Any | >50Ω | Possible open connection |
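
If you log readings for many boards, the bands above can be encoded as a small lookup. This is a minimal sketch in Python: the function name is hypothetical, and the "indeterminate" result for readings that fall between the listed bands (for example 1Ω or 30Ω) is an assumption, since the guide does not classify them.

```python
import math


def classify_domain_resistance(ohms: float) -> str:
    """Map a power-off domain resistance reading to the guide's interpretation.

    Pass math.inf for an OL (open) reading.
    """
    if math.isinf(ohms):
        return "open: broken trace or lifted regulator"
    if ohms <= 0.5:
        return "short: do NOT power the board"
    if 2.0 <= ohms <= 10.0:
        return "normal"
    if ohms > 50.0:
        return "possible open connection"
    # Assumption: readings between the listed bands are left undecided.
    return "indeterminate: recheck probes and compare with other domains"


# Example readings for four domains (invented values for illustration).
for domain, reading in {1: 4.7, 2: 0.2, 3: math.inf, 4: 30.0}.items():
    print(f"Domain {domain}: {classify_domain_resistance(reading)}")
```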

2b: Powered Voltage Measurement

Only power the board if no shorts were detected in Step 2a. Powering a shorted domain will cause further damage and may destroy the buck converter.

To measure domain voltages under power, you need to connect the hashboard to a test fixture or bench power supply capable of delivering 12V at sufficient current (at least 20A for a single domain test, or the full PSU for all-domain measurement).

For each of the 12 voltage domains:

  1. Set your multimeter to DC voltage (2V range)
  2. Place the black probe on ground
  3. Place the red probe on the positive voltage output pad for that domain
  4. Record the reading

| Domain | Measured Voltage | Status |
| --- | --- | --- |
| 1–12 | 0.28–0.32V | Normal |
| Any | 0V | Dead domain: regulator not switching, or shorted chip pulling voltage to ground |
| Any | >0.35V | High: possible open chip (fewer chips sharing the current) or regulator issue |
| Any | Fluctuating | Unstable: intermittent connection, failing chip, or regulator oscillation |

Interpreting Results:

  • All domains at 0V: Check 12V input to the board, check the 18-pin connector, and verify the main input MOSFET is conducting.
  • Single domain at 0V: The buck converter for that domain has failed, OR a chip in that domain is shorted pulling voltage to ground. Check the domain's regulator and then individual chip resistances.
  • Multiple adjacent domains at 0V: Possible connector issue, or a cascading failure from one domain affecting neighboring power traces.
  • Domain reading high (>0.35V): One or more chips in that domain are open (dead but not shorted). The regulator compensates by raising voltage for the remaining chips.

Record all 12 domain voltages in a table. This is your diagnostic baseline.
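
The interpretation rules above can be applied to a recorded baseline with a short script. The sketch below is one possible encoding, not a standard tool: the function names are hypothetical, readings under 0.05V are treated as "0V" by assumption, and a fluctuating domain cannot be detected from a single value per domain.

```python
def classify_domain_voltage(volts):
    """Classify one powered domain reading using the guide's thresholds."""
    if volts < 0.05:              # assumption: under 50mV counts as 0V
        return "dead"
    if 0.28 <= volts <= 0.32:
        return "normal"
    if volts > 0.35:
        return "high"
    return "out of tolerance"


def summarize_board(readings):
    """Return diagnostic notes for a list of readings, domain 1 first."""
    status = [classify_domain_voltage(v) for v in readings]
    dead = [i + 1 for i, s in enumerate(status) if s == "dead"]
    high = [i + 1 for i, s in enumerate(status) if s == "high"]
    notes = []
    if dead and len(dead) == len(readings):
        notes.append("all domains at 0V: check 12V input, 18-pin connector, input MOSFET")
    elif len(dead) == 1:
        notes.append(f"domain {dead[0]} at 0V: check its regulator, then chip resistances")
    elif len(dead) > 1:
        if dead[-1] - dead[0] == len(dead) - 1:
            notes.append(f"adjacent domains {dead} at 0V: suspect connector or cascading failure")
        else:
            # Not covered by the guide; handled here as separate faults.
            notes.append(f"domains {dead} at 0V: diagnose each one separately")
    if high:
        notes.append(f"domains {high} high: look for open (dead, not shorted) chips")
    return notes


baseline = [0.30, 0.31, 0.29, 0.30, 0.00, 0.30, 0.30, 0.31, 0.30, 0.29, 0.30, 0.37]
for note in summarize_board(baseline):
    print(note)
```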

Step 3: Signal Chain Testing

The BM1368 chips are connected in a daisy chain for communication. The control board sends commands through the CI (Command In) line, and responses return via the RI (Response In) line. A break anywhere in the chain prevents communication with all chips after the break point.

Signal Lines to Test:

| Signal | Purpose | Expected Level | Notes |
| --- | --- | --- | --- |
| CLK | Clock signal | 25MHz square wave (1.8V amplitude) | Use oscilloscope for verification |
| CI | Command In | 1.8V idle, pulses during communication | Chain input from control board |
| RI | Response In | 1.8V idle, pulses during communication | Chain output back to control board |
| RST | Reset | 1.8V during normal operation, 0V while held in reset | Active low |
| BO | Bootup/Bootstrap | Varies | Used during chip initialization |

Testing Procedure:

  1. CLK line continuity: Using the multimeter in continuity mode, verify the CLK trace is continuous from the connector through each chip. A break in CLK will prevent all downstream chips from operating.

  2. CI/RI chain integrity: The CI signal enters chip #0 and exits as the CI input for chip #1, forming a chain through all 129 chips. To identify a break:

    • If the miner reports "chain find 0 ASIC" — the break is before chip #0 or the first chip is dead
    • If it reports some chips detected (e.g., "chain find 80 ASIC" instead of 129) — the chain breaks at approximately chip #80. Use binary search (dichotomy) to narrow down
  3. Binary search method for chain breaks:

    • Probe the CI/RI signal at the midpoint chip (#64; the 129 chips are numbered #0–#128)
    • If signal is present, the break is in chips #65–#128
    • If signal is absent, the break is in chips #0–#64
    • Continue halving until you identify the exact break point
  4. RST line check: Verify the RST signal reaches all chips. An open RST line holds chips in reset permanently.
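
The halving procedure in item 3 can be sketched in Python as follows. This is an illustration under stated assumptions: `signal_present` is a hypothetical stand-in for a manual probe at a given chip, the chips are numbered #0 to #128, and the signal is assumed to be present at every chip before the break and absent from the break onward.

```python
from typing import Callable, Tuple


def find_chain_break(n_chips: int, signal_present: Callable[[int], bool]) -> Tuple[int, int]:
    """Locate the first chip with no signal by repeated halving.

    Returns (chip_index, probes_used). chip_index equals n_chips when the
    signal reaches every chip, i.e. no break was found.
    """
    lo, hi = 0, n_chips        # the first dead position lies in [lo, hi]
    probes = 0
    while lo < hi:
        mid = (lo + hi) // 2   # first probe on a 129-chip board is chip #64
        probes += 1
        if signal_present(mid):
            lo = mid + 1       # break is after chip #mid
        else:
            hi = mid           # break is at or before chip #mid
    return lo, probes


# Simulated board whose chain breaks at chip #80.
break_at = 80
chip, probes = find_chain_break(129, lambda c: c < break_at)
print(f"Break located at chip #{chip} after {probes} probes")
```

On a 129-chip chain this search needs 7 or 8 probes depending on where the break sits, compared with up to 129 for a chip-by-chip walk.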

An oscilloscope makes signal chain testing much faster and more reliable. If you are performing regular hashboard repair, investing in even a basic 2-channel oscilloscope (such as the Rigol DS1054Z) is highly recommended. However, for most common failures, multimeter continuity testing of the chain is sufficient.

Step 4: Component-Level Diagnosis

Once you have identified the problem area (failed domain, broken chain segment, or specific symptoms), perform component-level testing.

4a: Individual Chip Testing

To test if a specific BM1368 chip is shorted:

  1. Set your multimeter to diode mode
  2. Place the black probe on the chip's ground pad
  3. Place the red probe on the chip's VDD (core voltage) pad
  4. A healthy chip shows a forward voltage drop of 0.3–0.6V
  5. A reading of 0V or very low (0.01–0.05V) indicates a shorted chip
  6. An OL (open) reading may indicate a lifted chip or broken solder joint
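
When you probe every chip in a suspect domain, the thresholds above can be used to shortlist candidates for replacement. A minimal Python sketch, assuming readings are logged per chip index; the function names and sample values are hypothetical, and readings between the listed bands are reported as "inconclusive" because the guide does not classify them.

```python
import math


def classify_diode_reading(volts):
    """Classify a diode-mode reading (VDD to ground) for one chip.

    Pass math.inf for an OL (open) reading.
    """
    if math.isinf(volts):
        return "open"          # lifted chip or broken solder joint
    if volts <= 0.05:
        return "shorted"
    if 0.3 <= volts <= 0.6:
        return "healthy"
    return "inconclusive"      # assumption: outside the guide's bands


def suspect_chips(readings):
    """Return {chip_index: verdict} for every chip that is not healthy."""
    suspects = {}
    for chip, volts in readings.items():
        verdict = classify_diode_reading(volts)
        if verdict != "healthy":
            suspects[chip] = verdict
    return suspects


# Invented readings for five chips in one domain.
domain_readings = {40: 0.45, 41: 0.47, 42: 0.02, 43: math.inf, 44: 0.44}
print(suspect_chips(domain_readings))
```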

4b: Buck Converter Testing

Each voltage domain's buck converter consists of:

  • A switching controller IC
  • High-side and low-side MOSFETs
  • An output inductor
  • Input and output capacitors

To test the buck converter:

  1. Input voltage: Verify 12V reaches the converter input
  2. MOSFET gate signals: With an oscilloscope, verify the controller is producing switching signals
  3. Output inductor: Check continuity — an open inductor kills the domain
  4. Output capacitors: Check for shorts — a shorted capacitor pulls the domain to ground

4c: Capacitor Testing

Capacitors can fail short or open:

  • Shorted capacitor: Domain voltage reads 0V, capacitor body may be discolored
  • Open capacitor: Domain voltage may be noisy or slightly high
  • Test by lifting one leg of the suspect capacitor (one end, for surface-mount parts) and measuring the domain again

4d: Temperature Sensor Check

The S21 hashboard has temperature sensors (typically LM75A or similar) connected via I2C. If the miner reports "temperature too high" errors even at normal ambient:

  • Check the I2C pull-up resistors (typically 4.7kΩ to 3.3V)
  • Verify sensor continuity to the I2C bus
  • A failed sensor may report incorrect readings causing the miner to throttle or shut down
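
If you can read the raw temperature register over I2C, decoding it by hand helps separate a faulty sensor from a genuine overheat. The sketch below assumes the part really is an LM75A-compatible sensor, whose 16-bit temperature register holds an 11-bit two's-complement value in its upper bits at 0.125°C per step; other sensors use different formats. How the raw value is obtained is not shown, and the function name is hypothetical.

```python
def lm75a_raw_to_celsius(raw: int) -> float:
    """Convert a 16-bit LM75A temperature register value to degrees Celsius."""
    value = (raw >> 5) & 0x7FF   # upper 11 bits carry the reading
    if value & 0x400:            # sign bit set: negative temperature
        value -= 0x800
    return value * 0.125         # 0.125 degrees C per least significant bit


for raw in (0x1900, 0xE700, 0x7FE0):
    print(f"0x{raw:04X} -> {lm75a_raw_to_celsius(raw):.3f} C")
```

A decoded value far outside what the board could plausibly be at (for instance the maximum code on a cold board) points at the sensor or its bus, not at the chips.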

Step 5: BM1368 Chip Replacement (BGA Rework)

The BM1368 uses a BGA (Ball Grid Array) package. Replacing a BGA chip requires a hot air rework station and proper technique.

BGA rework requires practice. If you have not performed BGA rework before, practice on scrap boards first. A botched rework can lift pads, bridge connections, or damage adjacent chips. See our Soldering Techniques Guide for detailed BGA rework instructions.

Removal Procedure:

  1. Prepare the workspace: Ensure ESD protection is active. Place the hashboard on a preheater or heat-resistant mat.

  2. Apply flux: Apply Amtech NC-559-V2 flux generously around the target chip. The flux prevents oxidation during heating and helps the solder flow cleanly.

  3. Preheat the board: Set the preheater (or bottom heater of your rework station) to 150°C. Allow the board to soak at this temperature for 2–3 minutes. Preheating reduces thermal shock and prevents PCB warping.

  4. Heat the chip: Using the hot air station with a nozzle sized to the BM1368 package:

    • Air temperature: 350–380°C
    • Airflow: Medium (40–50%)
    • Hold the nozzle 5–10mm above the chip
    • Move in a slow circular pattern
    • The chip will release in approximately 60–90 seconds when the solder reaches reflow temperature (~230°C at the joint)
  5. Remove the chip: When you see the solder become liquid (the chip may shift slightly), gently lift the chip straight up using vacuum tweezers or a suction tool. Do not twist or slide the chip — this will smear solder across adjacent pads.

  6. Clean the pads: Apply fresh flux to the exposed pads and use solder wick with a soldering iron at 350°C to remove excess solder. The pads should be flat and clean with a thin, even solder coating.

Installation Procedure:

  1. Inspect the new chip: Verify the BM1368 replacement chip has intact solder balls. If balls are missing or deformed, the chip needs reballing using a BGA stencil and solder paste.

  2. Apply flux to the pads: A thin, even layer of flux on the cleaned PCB pads.

  3. Align the chip: Place the BM1368 on the pads, aligning the orientation marker (dot or triangle) with the PCB marking. The BGA balls should sit centered on their respective pads.

  4. Reflow: Repeat the preheating and hot air process:

    • Preheat to 150°C (2–3 minutes soak)
    • Hot air at 350–380°C until the solder reflows
    • You will see the chip "snap" into alignment as the solder balls melt and surface tension centers the chip — this is called self-alignment
  5. Cool down: Allow the board to cool gradually. Do not use compressed air to force-cool, as rapid thermal change can crack solder joints.

  6. Clean flux residue: Use IPA and a brush to remove all flux residue around the replaced chip.

  7. Inspect under magnification: Verify the chip is properly aligned, no solder bridges are visible between pads, and no flux residue remains trapped under the chip.

Step 6: Verification and Testing

After completing the repair, verify the fix before declaring success.

6a: Post-Repair Resistance Check

Repeat the domain resistance check from Step 2a:

  • The repaired domain should read 2–10Ω
  • If it still reads near 0Ω, the replacement chip may be defective or another chip in the domain is also shorted
  • If it reads OL, verify the chip is properly soldered

6b: Powered Voltage Test

  1. Connect the hashboard to a test fixture or the miner
  2. Power on and measure all 12 domain voltages
  3. All domains should read 0.28–0.32V
  4. The repaired domain should match the others within ±0.02V

6c: Full Mining Test

  1. Reinstall the hashboard in the miner enclosure
  2. Connect all cables (18-pin data, power)
  3. Power on the miner and access the web dashboard
  4. Navigate to Miner Status > Hash Board
  5. Verify:
    • All 129 chips are detected (ASIC count = 129)
    • All chips report a valid frequency
    • Hashrate is within expected range (~67 TH/s per board)
    • No error messages in the kernel log

```shell
# SSH diagnostic commands to verify the repair.
# File paths and tool names can differ between firmware versions.
# Connect via SSH (default user: root)
ssh root@<miner-ip>

# Check chip status
cat /tmp/freq.txt

# View kernel log for chain, ASIC, or error messages
dmesg | grep -i "chain\|asic\|error"

# Check real-time hashrate
cgminer-api summary
```

6d: Burn-In Period

Run the repaired board for at least 24 hours before considering the repair complete:

  • Monitor hashrate stability — it should not fluctuate more than ±5%
  • Monitor temperature — all chips should be within the normal operating range (60–85°C)
  • Check for error messages in the kernel log periodically
  • A successful 24-hour burn-in confirms the repair is stable
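
The burn-in criteria can be checked against logged samples with a few lines of Python. This is a sketch under assumptions: the function name and sample data are hypothetical, and the ±5% limit is measured here against the mean of the logged hashrate samples, which the guide does not specify.

```python
from statistics import mean


def burn_in_passes(hashrates_ths, chip_temps_c,
                   max_deviation=0.05, temp_range=(60.0, 85.0)):
    """Return True if hashrate stays within +/-5% of its mean and every
    temperature sample lies inside the normal operating range."""
    average = mean(hashrates_ths)
    stable = all(abs(h - average) / average <= max_deviation
                 for h in hashrates_ths)
    in_range = all(temp_range[0] <= t <= temp_range[1]
                   for t in chip_temps_c)
    return stable and in_range


# Invented samples taken at intervals during a 24-hour run.
good_run = burn_in_passes([66.8, 67.2, 66.5, 67.9, 66.1], [68, 72, 75, 71, 70])
bad_run = burn_in_passes([67.0, 67.1, 60.0, 66.9], [68, 72, 75, 71])
print("good run passes:", good_run)
print("bad run passes:", bad_run)
```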

Common Failure Patterns

| Symptom | Likely Cause | Diagnostic Step | Fix |
| --- | --- | --- | --- |
| "Chain find 0 ASIC" | Broken signal chain at chip #0, failed first chip, or connector issue | Step 3: check CI/RI at first chip | Replace first chip or repair connector |
| Single domain at 0V | Shorted chip in domain pulling voltage to ground | Step 2a resistance check on that domain | Identify and replace shorted chip |
| All domains at 0V | No 12V input, failed input MOSFET, or blown input fuse | Check 12V at board input connector | Repair input power path |
| Hashrate 50% of expected | Half of domains failed or half of chips missing | Step 2b: check all domain voltages | Repair failed domains |
| Intermittent hashrate drops | Thermal cycling causing cracked solder joint, loose chip | Run thermal camera during operation | Reflow or replace affected chip |
| "Temp too high" error | Failed temperature sensor or blocked airflow | Step 4d: check temp sensors | Replace sensor or clean heatsink |
| Board detected but 0 hashrate | CLK signal missing, firmware issue | Step 3: verify CLK with oscilloscope | Repair CLK trace or reflash firmware |
| Random chip drops during operation | Marginal solder joint, failing chip | Monitor chip count over time | Reflow suspicious chips or replace |

Troubleshooting FAQ

How many ASIC chips does the S21 hashboard have?

Each Antminer S21 hashboard contains 129 BM1368 ASIC chips organized into 12 voltage domains of 10–11 chips each. A fully functional S21 with 3 hashboards has 387 chips total.

What voltage should each domain read on the S21?

Each voltage domain should read approximately 0.30V (range: 0.28–0.32V). The I/O voltage rail should read 1.8V and the logic rail 3.3V.

Can I run the S21 with a missing chip?

Technically yes — if a single chip is removed (desoldered), the remaining chips in that domain will still operate but at reduced hashrate. However, this is not recommended as a permanent solution. The domain voltage will be slightly higher than normal to compensate.

What is the binary search (dichotomy) method for finding a bad chip?

Instead of testing all 129 chips individually, you measure the signal chain at the midpoint. If the signal is present, the fault is in the second half. If absent, it is in the first half. Repeat this halving process until you isolate the exact faulty chip. Because each measurement halves the remaining range, this reduces 129 tests to 7 or 8.

Why does my S21 show "chain find 0 ASIC" on one board?

This means the control board cannot communicate with any chip on that hashboard. Common causes: bad 18-pin connector, broken CI signal trace near chip #0, failed first chip in the chain, or a shorted chip pulling the entire signal low. Start with connector inspection, then trace the CI signal from the connector to chip #0.

How long should BM1368 chip replacement take?

For an experienced technician, a single BGA chip replacement takes 15–30 minutes including removal, pad cleaning, placement, and reflow. Allow additional time for cooling and inspection. The full diagnostic process for a failed board typically takes 1–2 hours.

Can I use leaded solder for BM1368 rework?

Yes. Many professional repair technicians prefer 63/37 leaded solder (melting point ~183°C) over lead-free SAC305 (~217°C) because it flows at lower temperatures, reducing thermal stress on the PCB and adjacent components. However, be aware that mixing leaded and lead-free solder can create weak joints if temperatures are not managed correctly.
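
The difference between the two alloys is easy to quantify from the figures quoted in this guide. The short Python calculation below compares each melting point with the approximate 230°C joint temperature mentioned in Step 5; the variable names are hypothetical, and the result is a rough comparison, not a reflow profile.

```python
# Approximate melting points quoted in this guide, in degrees Celsius.
MELTING_POINT_C = {"63/37 leaded": 183, "SAC305 lead-free": 217}
JOINT_TARGET_C = 230  # approximate joint temperature from Step 5

# How far below the joint target each alloy becomes liquid.
headroom_c = {alloy: JOINT_TARGET_C - melt
              for alloy, melt in MELTING_POINT_C.items()}

for alloy, margin in headroom_c.items():
    print(f"{alloy}: liquid about {margin} C below the joint target")
```

The wider gap for leaded solder means the joint becomes liquid earlier in the heating cycle, so the chip and its neighbors spend less time at peak temperature.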

What is the expected lifespan of a properly repaired S21 hashboard?

A properly repaired hashboard should perform identically to a new one. The BM1368 chips themselves are rated for years of continuous operation. The most common failure recurrence is from inadequate flux cleaning (causes corrosion over time) or thermal paste degradation (requires replacement every 12–18 months under continuous operation).