Antminer S21 Hashboard Repair Guide
Step-by-step guide to diagnose and repair Antminer S21 hashboards — voltage domain testing, BM1368 chip replacement, and common failure fixes.
Overview
The Bitmain Antminer S21 is a high-performance SHA-256 Bitcoin miner delivering approximately 200 TH/s across three hashboards. Each hashboard contains 129 BM1368 ASIC chips organized into 12 voltage domains of 10–11 chips each. The BM1368 is manufactured on a 5nm process node, making it one of the most energy-efficient mining chips available.
This guide provides a complete, step-by-step procedure for diagnosing and repairing S21 hashboards. Whether you are dealing with missing chips, dead voltage domains, low hashrate, or complete board failure, this guide covers the systematic approach used by professional repair technicians.
Safety First: The S21 PSU (APW17) delivers 12V DC at up to 300A. Always disconnect power and wait at least 60 seconds for capacitor discharge before handling hashboards. Wear an ESD wrist strap connected to a grounded mat at all times. Failure to follow proper safety procedures can result in serious injury or permanent damage to components. See the ESD Safety Guide for complete ESD procedures.
Required Tools
Basic Tools
- Digital multimeter — Fluke 15B+ or equivalent (must resolve 0.01V or better on the DC voltage range)
- Phillips #2 screwdriver — for enclosure disassembly
- ESD wrist strap and mat — mandatory for all hashboard work
- Compressed air — filtered, moisture-free (for dust removal)
- 99% isopropyl alcohol (IPA) — for cleaning flux residue and thermal paste
- Lint-free microfiber cloths — for cleaning
- Magnifying glass or loupe (10x–20x) — for visual inspection
- Bright LED work light — angled lighting reveals solder defects
Advanced Tools (for component-level repair)
- Hot air rework station — Quick 861DW or Hakko FR-810 (for BGA rework)
- Soldering iron — Hakko FX-951 or JBC CD-2BE with fine tip (for discrete components)
- Flux — Amtech NC-559-V2 or equivalent no-clean flux
- Solder paste — SAC305 or 63/37 leaded (for BGA reballing)
- Solder wick — 2mm width for pad cleaning
- Thermal paste — Arctic MX-5 or Thermal Grizzly Kryonaut
- Oscilloscope (optional) — for signal chain analysis
- Thermal camera (optional) — FLIR or equivalent for hot spot detection
- BGA stencil — for BM1368 reballing if needed
Prerequisites
Before starting this repair, you should:
- Be comfortable using a digital multimeter for DC voltage and continuity measurements — see our Multimeter Testing Guide
- Understand the basics of hashboard architecture — see How Hashboards Work
- Have a properly set up ESD-safe workspace — see ESD Safety Guide
- Have access to the tools listed above
- Be familiar with basic soldering if performing component-level repair — see Soldering Techniques
Hashboard Specifications
| Parameter | Value |
|---|---|
| Miner Model | Antminer S21 |
| Manufacturer | Bitmain |
| ASIC Chip | BM1368 (5nm SHA-256) |
| Hashrate per Board | ~67 TH/s |
| Total Hashrate (3 boards) | ~200 TH/s |
| Chips per Board | 129 |
| Voltage Domains | 12 |
| Chips per Domain | 10–11 |
| Core Voltage (VDD) | 0.30V ±0.02V per domain |
| I/O Voltage (VDDIO) | 1.8V |
| Logic Voltage (VDD33) | 3.3V |
| Input Voltage | 12V DC |
| Power per Board | ~1167W |
| Signal Chain | CLK, CI (Command In), RI (Response In), RST, BO |
| Connector | 18-pin hashboard connector |
| PSU | APW17 (3600W) |
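The figures in this table can be cross-checked against each other. The sketch below is only arithmetic on values quoted in this guide. The 9/3 split between 11-chip and 10-chip domains is derived from the totals (it is the only split that gives 129 chips across 12 domains) and is not a published layout.

```python
# Cross-check the specification table using only values quoted in this guide.
CHIPS_PER_BOARD = 129
DOMAINS_PER_BOARD = 12
BOARDS = 3
TOTAL_HASHRATE_THS = 200
PSU_WATTS = 3600
INPUT_VOLTS = 12

# Solve 11*a + 10*b = 129 with a + b = 12 for the domain sizes.
domains_of_11 = CHIPS_PER_BOARD - 10 * DOMAINS_PER_BOARD
domains_of_10 = DOMAINS_PER_BOARD - domains_of_11

total_chips = CHIPS_PER_BOARD * BOARDS
hashrate_per_board_ths = TOTAL_HASHRATE_THS / BOARDS
max_psu_current_a = PSU_WATTS / INPUT_VOLTS

print(f"{domains_of_11} domains of 11 chips, {domains_of_10} domains of 10 chips")
print(f"{total_chips} chips total, {hashrate_per_board_ths:.1f} TH/s per board")
print(f"PSU current at full load: {max_psu_current_a:.0f} A")
```

The 300A result is the same figure quoted in the safety note, which is why the discharge wait and ESD precautions matter.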
Repair Procedure
Step 1: Visual Inspection
Begin every repair with a thorough visual inspection. Many common failures are visible to the naked eye or under magnification.
Remove the hashboard from the miner enclosure:
- Disconnect all power cables from the PSU
- Wait 60 seconds for capacitor discharge
- Remove the top cover (4 Phillips screws)
- Disconnect the 18-pin hashboard data cable from the control board
- Disconnect the power cables from the hashboard
- Slide the hashboard out of its slot
Inspect under bright, angled light for:
- Burnt or discolored components — darkened areas around chips or voltage regulators indicate thermal damage or short circuits. Pay special attention to the buck converter areas between voltage domains.
- Cracked or cold solder joints — look for dull, grainy solder joints instead of smooth, shiny ones. Common on the 18-pin connector and large capacitors.
- Shifted or misaligned chips — BM1368 chips should be perfectly aligned on their pads. A shifted chip indicates a rework attempt or thermal cycling damage.
- Swollen or bulging capacitors — electrolytic capacitors near voltage regulators that are domed on top have failed.
- Corrosion or liquid damage — green or white deposits indicate moisture exposure. Clean with IPA before further diagnostics.
- Damaged or burnt traces — dark lines on the PCB surface indicate a trace has carried excessive current.
- Flux residue buildup — excessive flux around chips can indicate prior rework. Clean with IPA to inspect the actual joint quality beneath.
- Bent or damaged connector pins — check the 18-pin hashboard connector for bent, pushed-back, or oxidized pins.
Pro tip: Use a thermal camera on a powered board (if it partially works) before disassembly. Hot spots visible on thermal imaging instantly identify shorted chips or failed regulators, saving significant diagnostic time.
Document your findings before proceeding. Take photos of any suspicious areas — you will reference these during component-level diagnosis.
Step 2: Voltage Domain Testing
Voltage domain testing is the most important diagnostic step. Each of the 12 voltage domains on the S21 hashboard is powered by its own buck converter (voltage regulator) that steps 12V input down to approximately 0.30V for the ASIC chip cores.
What you need:
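- Digital multimeter with resistance (Ω) mode for Step 2a and DC voltage on the 2V range for Step 2b (a 200mV range cannot display the expected 0.30V)
- Hashboard removed from miner (no power is needed for the initial test, because resistance is measured first)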
2a: Domain Resistance Check (Power Off)
Before applying power, check each domain's resistance to identify shorts:
- Set your multimeter to resistance mode (Ω)
- Place the red probe on the positive output pad of each domain's buck converter
- Place the black probe on the ground plane (any large ground pad or screw hole)
- A healthy domain reads 2–10Ω
- A reading of 0Ω or near 0Ω indicates a short circuit in that domain — do NOT power the board
- An open reading (OL/∞) indicates a broken connection
| Domain | Expected Resistance | Interpretation |
|---|---|---|
| Any | 2–10Ω | Normal |
| Any | 0–0.5Ω | Short circuit — shorted chip or capacitor |
| Any | OL (open) | Broken trace or lifted regulator |
| Any | >50Ω | Possible open connection |
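If you log readings for many boards, the table above can be turned into a small classifier. This is a minimal Python sketch, not part of any Bitmain tooling. The function name is hypothetical, an OL reading is assumed to be entered as `float("inf")`, and the "indeterminate" label for readings that fall between the documented bands is an assumption.

```python
import math


def classify_domain_resistance(ohms: float) -> str:
    """Classify a power-off domain resistance reading using the Step 2a bands."""
    if math.isinf(ohms):
        return "open: broken trace or lifted regulator"
    if ohms <= 0.5:
        return "short: do not power the board"
    if 2.0 <= ohms <= 10.0:
        return "normal"
    if ohms > 50.0:
        return "possible open connection"
    # Readings between the documented bands are not covered by the table.
    return "indeterminate: recheck probe contact and compare with other domains"


if __name__ == "__main__":
    for reading in (4.7, 0.2, 75.0, float("inf"), 1.2):
        print(reading, "->", classify_domain_resistance(reading))
```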
2b: Powered Voltage Measurement
Only power the board if no shorts were detected in Step 2a. Powering a shorted domain will cause further damage and may destroy the buck converter.
To measure domain voltages under power, you need to connect the hashboard to a test fixture or bench power supply capable of delivering 12V at sufficient current (at least 20A for a single domain test, or the full PSU for all-domain measurement).
For each of the 12 voltage domains:
- Set your multimeter to DC voltage (2V range)
- Place the black probe on ground
- Place the red probe on the positive voltage output pad for that domain
- Record the reading
| Domain # | Expected Voltage | Status |
|---|---|---|
| 1–12 | 0.28–0.32V | Normal |
| Any | 0V | Dead domain — regulator not switching, or shorted chip pulling voltage to ground |
| Any | >0.35V | High — possible open chip (fewer chips sharing the current), regulator issue |
| Any | Fluctuating | Unstable — intermittent connection, failing chip, or regulator oscillation |
Interpreting Results:
- All domains at 0V: Check 12V input to the board, check the 18-pin connector, and verify the main input MOSFET is conducting.
- Single domain at 0V: The buck converter for that domain has failed, OR a chip in that domain is shorted pulling voltage to ground. Check the domain's regulator and then individual chip resistances.
- Multiple adjacent domains at 0V: Possible connector issue, or a cascading failure from one domain affecting neighboring power traces.
- Domain reading high (>0.35V): One or more chips in that domain are open (dead but not shorted). The regulator compensates by raising voltage for the remaining chips.
Record all 12 domain voltages in a table. This is your diagnostic baseline.
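Once the 12 readings are recorded, the interpretation rules above can be applied mechanically. The following Python sketch is one way to do that. The function names are hypothetical, the 0.02V cut-off for "0V" is an assumption, and the handling of non-adjacent dead domains is not covered by this guide.

```python
from typing import List

DEAD_BELOW_V = 0.02                    # assumption: under 20 mV counts as "0V"
NORMAL_MIN_V, NORMAL_MAX_V = 0.28, 0.32
HIGH_ABOVE_V = 0.35


def classify_domain_voltage(volts: float) -> str:
    """Map one powered reading to the status bands from Step 2b."""
    if volts < DEAD_BELOW_V:
        return "dead"
    if NORMAL_MIN_V <= volts <= NORMAL_MAX_V:
        return "normal"
    if volts > HIGH_ABOVE_V:
        return "high"
    return "out of range"


def interpret_board(voltages: List[float]) -> List[str]:
    """Return findings for a full set of domain readings (domain 1 first)."""
    statuses = [classify_domain_voltage(v) for v in voltages]
    dead = [n for n, s in enumerate(statuses, start=1) if s == "dead"]
    high = [n for n, s in enumerate(statuses, start=1) if s == "high"]
    findings = []
    if dead and len(dead) == len(voltages):
        findings.append("all domains at 0V: check 12V input, 18-pin connector and input MOSFET")
    elif len(dead) == 1:
        findings.append(f"domain {dead[0]} at 0V: check its regulator, then individual chip resistances")
    elif len(dead) > 1:
        adjacent = all(b - a == 1 for a, b in zip(dead, dead[1:]))
        if adjacent:
            findings.append(f"adjacent domains {dead} at 0V: possible connector issue or cascading failure")
        else:
            # Not covered by the guide; treating each as a separate fault is an assumption.
            findings.append(f"domains {dead} at 0V: diagnose each as a single-domain failure")
    for n in high:
        findings.append(f"domain {n} high: one or more chips may be open")
    return findings


if __name__ == "__main__":
    readings = [0.30] * 12
    readings[4] = 0.0   # simulate a dead domain 5
    for finding in interpret_board(readings):
        print(finding)
```

A fluctuating domain cannot be detected from a single reading, so it is left out of this sketch. Take several readings over time if you suspect one.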
Step 3: Signal Chain Testing
The BM1368 chips are connected in a daisy chain for communication. The control board sends commands through the CI (Command In) line, and responses return via the RI (Response In) line. A break anywhere in the chain prevents communication with all chips after the break point.
Signal Lines to Test:
| Signal | Purpose | Expected Voltage | Notes |
|---|---|---|---|
| CLK | Clock signal | 25MHz square wave (1.8V amplitude) | Use oscilloscope for verification |
| CI | Command In | 1.8V idle, pulses during communication | Chain input from control board |
| RI | Response In | 1.8V idle, pulses during communication | Chain output back to control board |
| RST | Reset | 1.8V during normal operation, 0V while in reset | Active low: chips are held in reset while the line is low |
| BO | Bootup/Bootstrap | Varies | Used during chip initialization |
Testing Procedure:
- CLK line continuity: Using the multimeter in continuity mode, verify the CLK trace is continuous from the connector through each chip. A break in CLK will prevent all downstream chips from operating.
- CI/RI chain integrity: The CI signal enters chip #0 and exits as the CI input for chip #1, forming a chain through all 129 chips (#0–#128). To identify a break:
  - If the miner reports "chain find 0 ASIC", the break is before chip #0 or the first chip is dead
  - If it reports only some chips detected (e.g., "chain find 80 ASIC" instead of 129), the chain breaks at approximately chip #80. Use binary search (dichotomy) to narrow it down
- Binary search method for chain breaks:
  - Probe the CI/RI signal at the midpoint chip (#64)
  - If signal is present, the break is in chips #65–128
  - If signal is absent, the break is in chips #0–64
  - Continue halving until you identify the exact break point
- RST line check: Verify the RST signal reaches all chips. An open RST line holds chips in reset permanently.
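The binary search described above can be sketched in Python to show how few probes it needs. This is an illustration only: `signal_present_at` is a hypothetical stand-in for you probing a chip with a meter or scope, and the sketch assumes the signal is present on every chip before the break and absent on every chip from the break onward.

```python
from typing import Callable, Optional, Tuple


def find_chain_break(
    signal_present_at: Callable[[int], bool],
    chip_count: int = 129,
) -> Tuple[Optional[int], int]:
    """Return (index of the first chip with no signal, number of probes).

    The index is None when the signal reaches every chip.
    """
    lo, hi = 0, chip_count   # the first dead chip lies in [lo, hi]; hi == chip_count means "none"
    probes = 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if signal_present_at(mid):
            lo = mid + 1     # signal is fine here, so the break is further down the chain
        else:
            hi = mid         # no signal here, so the break is at or before this chip
    return (lo if lo < chip_count else None), probes


if __name__ == "__main__":
    # Simulate a board where the signal stops at chip #80.
    first_dead, probes = find_chain_break(lambda chip: chip < 80)
    print(f"first chip without signal: #{first_dead}, found in {probes} probes")
```

With 129 chips the first probe lands on chip #64, matching the procedure above, and the loop never needs more than 8 probes.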
An oscilloscope makes signal chain testing much faster and more reliable. If you are performing regular hashboard repair, investing in even a basic oscilloscope (such as the 4-channel Rigol DS1054Z) is highly recommended. However, for most common failures, multimeter continuity testing of the chain is sufficient.
Step 4: Component-Level Diagnosis
Once you have identified the problem area (failed domain, broken chain segment, or specific symptoms), perform component-level testing.
4a: Individual Chip Testing
To test if a specific BM1368 chip is shorted:
- Set your multimeter to diode mode
- Place the black probe on the chip's ground pad
- Place the red probe on the chip's VDD (core voltage) pad
- A healthy chip shows a forward voltage drop of 0.3–0.6V
- A reading of 0V or very low (0.01–0.05V) indicates a shorted chip
- An OL (open) reading may indicate a lifted chip or broken solder joint
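When you test every chip in a suspect domain, it helps to record the readings and flag the outliers in one pass. The sketch below applies the diode-mode thresholds from this step. The function names are hypothetical, an OL reading is assumed to be recorded as `float("inf")`, and the "suspect" label for readings outside the documented bands is an assumption.

```python
import math
from typing import Dict, List


def classify_diode_reading(volts: float) -> str:
    """Classify one diode-mode reading taken across a chip's VDD and ground pads."""
    if math.isinf(volts):
        return "open"        # lifted chip or broken solder joint
    if volts <= 0.05:
        return "shorted"
    if 0.3 <= volts <= 0.6:
        return "healthy"
    return "suspect"         # outside the documented bands; compare with neighbouring chips


def chips_needing_attention(readings: Dict[int, float]) -> List[int]:
    """Return the chip numbers whose readings are not classified as healthy."""
    return sorted(chip for chip, v in readings.items() if classify_diode_reading(v) != "healthy")


if __name__ == "__main__":
    domain_readings = {40: 0.45, 41: 0.02, 42: float("inf"), 43: 0.50}
    print(chips_needing_attention(domain_readings))
```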
4b: Buck Converter Testing
Each voltage domain's buck converter consists of:
- A switching controller IC
- High-side and low-side MOSFETs
- An output inductor
- Input and output capacitors
To test the buck converter:
- Input voltage: Verify 12V reaches the converter input
- MOSFET gate signals: With an oscilloscope, verify the controller is producing switching signals
- Output inductor: Check continuity — an open inductor kills the domain
- Output capacitors: Check for shorts — a shorted capacitor pulls the domain to ground
4c: Capacitor Testing
Capacitors can fail short or open:
- Shorted capacitor: Domain voltage reads 0V, capacitor body may be discolored
- Open capacitor: Domain voltage may be noisy or slightly high
- Test by lifting one end of the suspect capacitor off its pad and measuring the domain again; if the reading returns to normal, that capacitor was the fault
4d: Temperature Sensor Check
The S21 hashboard has temperature sensors (typically LM75A or similar) connected via I2C. If the miner reports "temperature too high" errors even at normal ambient temperature:
- Check the I2C pull-up resistors (typically 4.7kΩ to 3.3V)
- Verify sensor continuity to the I2C bus
- A failed sensor may report incorrect readings causing the miner to throttle or shut down
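If you can capture the two raw bytes of the sensor's temperature register (for example with a logic analyzer on the I2C bus), decoding them by hand shows whether the sensor or the firmware is at fault. The sketch below assumes an LM75-compatible register layout: a 9-bit two's complement value, left-justified across two bytes, at 0.5°C per step. Whether the sensor on your board follows this layout is an assumption to check against its datasheet, and the function name is hypothetical.

```python
def decode_lm75_temperature(msb: int, lsb: int) -> float:
    """Decode two raw temperature-register bytes into degrees Celsius.

    Assumes the LM75-style format: the upper 9 bits of the 16-bit word hold a
    two's complement value with 0.5 degC per least significant bit.
    """
    raw = ((msb & 0xFF) << 8) | (lsb & 0xFF)
    value = raw >> 7          # keep the upper 9 bits
    if value & 0x100:         # sign bit set: negative temperature
        value -= 0x200
    return value * 0.5


if __name__ == "__main__":
    print(decode_lm75_temperature(0x19, 0x00))   # bytes for a room-temperature reading
    print(decode_lm75_temperature(0xE7, 0x00))   # bytes for a sub-zero reading
```

A decoded value far from the actual board temperature points to a failed sensor, while garbage or missing bytes point to the pull-ups or bus wiring.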
Step 5: BM1368 Chip Replacement (BGA Rework)
The BM1368 uses a BGA (Ball Grid Array) package. Replacing a BGA chip requires a hot air rework station and proper technique.
BGA rework requires practice. If you have not performed BGA rework before, practice on scrap boards first. A botched rework can lift pads, bridge connections, or damage adjacent chips. See our Soldering Techniques Guide for detailed BGA rework instructions.
Removal Procedure:
- Prepare the workspace: Ensure ESD protection is active. Place the hashboard on a preheater or heat-resistant mat.
- Apply flux: Apply Amtech NC-559-V2 flux generously around the target chip. The flux prevents oxidation during heating and helps the solder flow cleanly.
- Preheat the board: Set the preheater (or bottom heater of your rework station) to 150°C. Allow the board to soak at this temperature for 2–3 minutes. Preheating reduces thermal shock and prevents PCB warping.
- Heat the chip: Using the hot air station with a nozzle sized to the BM1368 package:
  - Air temperature: 350–380°C
  - Airflow: Medium (40–50%)
  - Hold the nozzle 5–10mm above the chip
  - Move in a slow circular pattern
  - The chip will release in approximately 60–90 seconds when the solder reaches reflow temperature (~230°C at the joint)
- Remove the chip: When you see the solder become liquid (the chip may shift slightly), gently lift the chip straight up using vacuum tweezers or a suction tool. Do not twist or slide the chip — this will smear solder across adjacent pads.
- Clean the pads: Apply fresh flux to the exposed pads and use solder wick with a soldering iron at 350°C to remove excess solder. The pads should be flat and clean with a thin, even solder coating.
Installation Procedure:
- Inspect the new chip: Verify the BM1368 replacement chip has intact solder balls. If balls are missing or deformed, the chip needs reballing using a BGA stencil and solder paste.
- Apply flux to the pads: A thin, even layer of flux on the cleaned PCB pads.
- Align the chip: Place the BM1368 on the pads, aligning the orientation marker (dot or triangle) with the PCB marking. The BGA balls should sit centered on their respective pads.
- Reflow: Repeat the preheating and hot air process:
  - Preheat to 150°C (2–3 minutes soak)
  - Hot air at 350–380°C until the solder reflows
  - You will see the chip "snap" into alignment as the solder balls melt and surface tension centers the chip — this is called self-alignment
- Cool down: Allow the board to cool gradually. Do not use compressed air to force-cool, as rapid thermal change can crack solder joints.
- Clean flux residue: Use IPA and a brush to remove all flux residue around the replaced chip.
- Inspect under magnification: Verify the chip is properly aligned, no solder bridges are visible between pads, and no flux residue remains trapped under the chip.
Step 6: Verification and Testing
After completing the repair, verify the fix before declaring success.
6a: Post-Repair Resistance Check
Repeat the domain resistance check from Step 2a:
- The repaired domain should read 2–10Ω
- If it still reads near 0Ω, the replacement chip may be defective or another chip in the domain is also shorted
- If it reads OL, verify the chip is properly soldered
6b: Powered Voltage Test
- Connect the hashboard to a test fixture or the miner
- Power on and measure all 12 domain voltages
- All domains should read 0.28–0.32V
- The repaired domain should match the others within ±0.02V
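The two acceptance criteria above (every domain in range, and the repaired domain within ±0.02V of the others) can be checked in a few lines. In this Python sketch the function name is hypothetical, and comparing the repaired domain against the median of the other domains is an assumption about what "match the others" means.

```python
from statistics import median
from typing import Dict, List


def verify_repaired_domain(
    voltages: List[float],
    repaired_domain: int,          # 1-based domain number
    tolerance_v: float = 0.02,
    normal_min_v: float = 0.28,
    normal_max_v: float = 0.32,
) -> Dict[str, object]:
    """Check post-repair readings against the Step 6b acceptance criteria."""
    out_of_range = [
        n for n, v in enumerate(voltages, start=1)
        if not normal_min_v <= v <= normal_max_v
    ]
    others = [v for n, v in enumerate(voltages, start=1) if n != repaired_domain]
    reference_v = median(others)   # assumption: "the others" is summarised by their median
    deviation_v = abs(voltages[repaired_domain - 1] - reference_v)
    return {
        "out_of_range_domains": out_of_range,
        "repaired_matches_others": deviation_v <= tolerance_v,
        "passed": not out_of_range and deviation_v <= tolerance_v,
    }


if __name__ == "__main__":
    readings = [0.30] * 12
    readings[4] = 0.31             # domain 5 was repaired
    print(verify_repaired_domain(readings, repaired_domain=5))
```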
6c: Full Mining Test
- Reinstall the hashboard in the miner enclosure
- Connect all cables (18-pin data, power)
- Power on the miner and access the web dashboard
- Navigate to Miner Status → Hash Board
- Verify:
  - All 129 chips are detected (ASIC count = 129)
  - All chips report a valid frequency
  - Hashrate is within expected range (~67 TH/s per board)
  - No error messages in the kernel log
# SSH diagnostic commands to verify repair
# Connect via SSH (default user: root)
ssh root@<miner-ip>
# Check chip status
cat /tmp/freq.txt
# View kernel log for errors
dmesg | grep -i "chain\|asic\|error"
# Check real-time hashrate
cgminer-api summary
6d: Burn-In Period
Run the repaired board for at least 24 hours before considering the repair complete:
- Monitor hashrate stability — it should not fluctuate more than ±5%
- Monitor temperature — all chips should be within the normal operating range (60–85°C)
- Check for error messages in the kernel log periodically
- A successful 24-hour burn-in confirms the repair is stable
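If you log hashrate and chip temperatures during the burn-in, the pass criteria above can be evaluated automatically. The Python sketch below is one way to do it. The function name is hypothetical, how you collect the samples is up to you, and measuring the ±5% fluctuation relative to the mean hashrate is an assumption.

```python
from typing import Dict, List


def check_burn_in(
    hashrates_ths: List[float],
    chip_temps_c: List[float],
    max_fluctuation: float = 0.05,
    temp_min_c: float = 60.0,
    temp_max_c: float = 85.0,
) -> Dict[str, object]:
    """Evaluate burn-in samples against the stability and temperature criteria."""
    mean_ths = sum(hashrates_ths) / len(hashrates_ths)
    # Largest relative deviation of any sample from the mean hashrate.
    worst_fluctuation = max(abs(h - mean_ths) / mean_ths for h in hashrates_ths)
    temps_out_of_range = [t for t in chip_temps_c if not temp_min_c <= t <= temp_max_c]
    return {
        "mean_ths": mean_ths,
        "worst_fluctuation": worst_fluctuation,
        "temps_out_of_range": temps_out_of_range,
        "passed": worst_fluctuation <= max_fluctuation and not temps_out_of_range,
    }


if __name__ == "__main__":
    result = check_burn_in(
        hashrates_ths=[67.0, 66.5, 67.4, 66.9],
        chip_temps_c=[65.0, 72.5, 80.0],
    )
    print(result["passed"], f"{result['worst_fluctuation']:.1%}")
```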
Common Failure Patterns
| Symptom | Likely Cause | Diagnostic Step | Fix |
|---|---|---|---|
| "Chain find 0 ASIC" | Broken signal chain at chip #0, failed first chip, or connector issue | Step 3 — check CI/RI at first chip | Replace first chip or repair connector |
| Single domain at 0V | Shorted chip in domain pulling voltage to ground | Step 2a resistance check on that domain | Identify and replace shorted chip |
| All domains at 0V | No 12V input, failed input MOSFET, or blown input fuse | Check 12V at board input connector | Repair input power path |
| Hashrate 50% of expected | Half of domains failed or half of chips missing | Step 2b — check all domain voltages | Repair failed domains |
| Intermittent hashrate drops | Thermal cycling causing cracked solder joint, loose chip | Run thermal camera during operation | Reflow or replace affected chip |
| "Temp too high" error | Failed temperature sensor or blocked airflow | Step 4d — check temp sensors | Replace sensor or clean heatsink |
| Board detected but 0 hashrate | CLK signal missing, firmware issue | Step 3 — verify CLK with oscilloscope | Repair CLK trace or reflash firmware |
| Random chip drops during operation | Marginal solder joint, failing chip | Monitor chip count over time | Reflow suspicious chips or replace |
Troubleshooting FAQ
How many ASIC chips does the S21 hashboard have?
Each Antminer S21 hashboard contains 129 BM1368 ASIC chips organized into 12 voltage domains of 10–11 chips each. A fully functional S21 with 3 hashboards has 387 chips total.
What voltage should each domain read on the S21?
Each voltage domain should read approximately 0.30V (range: 0.28–0.32V). The I/O voltage rail should read 1.8V and the logic rail 3.3V.
Can I run the S21 with a missing chip?
Technically yes — if a single chip is removed (desoldered), the remaining chips in that domain will still operate but at reduced hashrate. However, this is not recommended as a permanent solution. The domain voltage will be slightly higher than normal to compensate.
What is the binary search (dichotomy) method for finding a bad chip?
Instead of testing all 129 chips individually, you measure the signal chain at the midpoint. If the signal is present, the fault is in the second half. If absent, it is in the first half. Repeat this halving process until you isolate the exact faulty chip. This reduces up to 129 tests to 7 or 8, because each probe halves the remaining range and 2^7 = 128.
Why does my S21 show "chain find 0 ASIC" on one board?
This means the control board cannot communicate with any chip on that hashboard. Common causes: bad 18-pin connector, broken CI signal trace near chip #0, failed first chip in the chain, or a shorted chip pulling the entire signal low. Start with connector inspection, then trace the CI signal from the connector to chip #0.
How long should BM1368 chip replacement take?
For an experienced technician, a single BGA chip replacement takes 15–30 minutes including removal, pad cleaning, placement, and reflow. Allow additional time for cooling and inspection. The full diagnostic process for a failed board typically takes 1–2 hours.
Can I use leaded solder for BM1368 rework?
Yes. Many professional repair technicians prefer 63/37 leaded solder (melting point ~183°C) over lead-free SAC305 (~217°C) because it flows at lower temperatures, reducing thermal stress on the PCB and adjacent components. However, be aware that mixing leaded and lead-free solder can create weak joints if temperatures are not managed correctly.
What is the expected lifespan of a properly repaired S21 hashboard?
A properly repaired hashboard should perform identically to a new one. The BM1368 chips themselves are rated for years of continuous operation. The most common failure recurrence is from inadequate flux cleaning (causes corrosion over time) or thermal paste degradation (requires replacement every 12–18 months under continuous operation).
Related Guides
- Antminer S21 Control Board Diagnostics — when the problem is the control board, not the hashboard
- Antminer S21 Power Supply Troubleshooting — PSU diagnostics for the APW17
- Antminer S21 Firmware Recovery — firmware flash and recovery procedures
- Antminer S21 Thermal Maintenance — thermal paste replacement and cooling
- Antminer T21 Hashboard Repair — the T21 uses the same BM1368 chip
- BM1368 ASIC Chip Reference — chip specifications and pinout
- Antminer Error Codes — complete error code reference
- ESD Safety Guide — protect your boards during repair
- Multimeter Testing Guide — essential measurement techniques