Abstract: Recent advances in Large Language Models (LLMs), such as GPT and LLaMA, have demonstrated remarkable capabilities across a wide array of natural language processing tasks. Despite these ...
# Parallelism strategy comparison: same model + hardware, different EP/TP/DP split.
# All configs use 8 GPUs on A100+Slingshot with two-tier networking. The traces
# capture measured compute time, so ...
# ./run_megatron_mimo_parallelism_tests.sh --gpus 4  # Run all configs with 4 GPUs
# ./run_megatron_mimo_parallelism_tests.sh --config tp2_both  # Run only tp2_both config ...