ASC23: Meet the Teams

By Dan Olds


The ASC23 cluster competition was held in a basketball stadium on the campus of the University of Science and Technology of China, located in Hefei, China – a modest Chinese town of nine million.

The competition nearly filled up the floor of the stadium but gave the student teams a comfortable amount of room for their work tables and systems. With the size of the room, there wasn’t any problem with keeping the temperatures at a comfortable level, but with all of the systems running, the sound level was incredible.

We interviewed nearly all the teams and, aided greatly by our trusty translators, we were able to talk about their configuration choices, the competition, and the challenges they anticipated they’d be facing. We put in a couple of long days filming, but it was a lot of fun to meet the students and learn more about them.

Here are the teams we interviewed at the stadium:

Fuzhou University: This is the fourth ASC competition for Fuzhou; they previously competed at ASC18, ASC19, and ASC21. Their final configuration consisted of three nodes and six GPUs. We processed this one in black and white because our camera exposure was set way too high. So kind of a noir look on this one.



Jinan University: They are the defending champions from ASC21, but also participated at ASC19, ISC21 (scoring Bronze), and SC21. Sporting a three-node, six-GPU cluster, they’re looking to repeat their success from the last ASC competition.



Lanzhou University: Competing in their second ASC event, Lanzhou is looking to drive their three-node, six-GPU cluster to glory. Well, no, they didn’t exactly say that; they basically said that they’d try to do their best, but I can read between the lines.



Peking University: This is the eighth cluster competition for Peking University and their third ASC appearance. They’re looking to unseat their Beijing cross-town rival Tsinghua University and take home some trophy hardware. The team is driving a cluster that is a departure from the three node, six GPU configurations we’ve been seeing so far. The Peking cluster is three nodes, accompanied by nine GPUs, which should certainly give them more performance on several of the tasks – but only if they can precisely control the power draw and stay under the 3,000 watt power cap.
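To see why a nine-GPU build is a gamble under a 3,000-watt cap, here is a minimal back-of-the-envelope sketch. All per-component wattages are illustrative assumptions for the sake of the arithmetic, not measured figures from any team’s actual hardware.

```python
# Hypothetical power-budget check against the ASC 3,000 W cap.
# The per-node and per-GPU wattages are assumed round numbers,
# not specs from Peking's (or anyone's) real cluster.

POWER_CAP_W = 3000  # competition power cap mentioned in the article

def cluster_draw(nodes, gpus, node_base_w=350, gpu_w=250):
    """Estimate total draw: per-node base load (CPUs, RAM, fans) plus GPUs."""
    return nodes * node_base_w + gpus * gpu_w

def fits_under_cap(nodes, gpus, **kw):
    return cluster_draw(nodes, gpus, **kw) <= POWER_CAP_W

# Three nodes and nine GPUs at full tilt blow past the cap:
print(cluster_draw(3, 9))                # 3*350 + 9*250 = 3300 W
print(fits_under_cap(3, 9))              # False
# Power-limiting each GPU to ~200 W (e.g. with a tool like
# nvidia-smi's power-limit setting) brings the same hardware under:
print(fits_under_cap(3, 9, gpu_w=200))   # 3*350 + 9*200 = 2850 W -> True
```

The point of the sketch: the extra GPUs only pay off if per-GPU power limits are tuned so that the whole cluster stays under the cap, which is exactly the “precisely control the power draw” problem the team faces.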



Qilu University of Technology: In an extraordinarily washed-out video, we talk to the team from Qilu University of Technology. They’re a first-time competitor, but don’t seem intimidated by the pressure at all. The team has configured a four-node, eight-GPU cluster, which is significantly larger than the other competitors’. Maybe this will give them the edge they need to make some waves at ASC23.



Qinghai University: Qinghai is making their third ASC appearance and is driving a three-node, four-GPU cluster. In the video, we interview the team leader, who discusses pre-competition nerves and the chance that they’re under-powered when it comes to hardware. The team might move to a dual-node, four-GPU configuration, which should allow them to run full out on the applications but might not get them to the performance they need to triumph.



Shanghai Jiao Tong University: This is the 12th cluster competition appearance for Shanghai Jiao Tong University, making them one of the most experienced institutions in the competition. They have won three Silver medals, plus a Bronze medal in previous competitions, but haven’t brought home a trophy since ASC18. The team believes that the most difficult part of the challenge for them will be the AI-centric tasks.



Shanghai University: Another first-time competitor in student cluster competitions, Shanghai University has a mountain to climb. Not only are they here for the first time, but they’re outgunned on the hardware side with only four GPUs when most of the other teams are sporting six. To compensate, they’re running four nodes to give them a bit more CPU power, which should help on some of the applications. What will help more is that some of their team members have real-world HPC experience, which is something that most of the other teams lack. We’ll see if it pays off.



ShanghaiTech University: This is the seventh competition for the team from ShanghaiTech University. The team nailed down a Silver medal in their first competition at ASC18. They’re driving four nodes and eight GPUs. At the time of the taping, the team was trying to see if they could directly connect the nodes together and thus be able to devote some extra power (as much as 300 watts) to their compute components. But in order to accomplish this, they’ll need to get their hands on some dual-port IB NICs, which, at this point, doesn’t seem to be in the cards. Gotta like the innovative thinking, right?



Shanxi University: Third-time competitor Shanxi has settled on a four-node, eight-GPU cluster after experimenting with several other configurations. The biggest challenge, in their minds, will be the YLLM training task, which will require them to build a language model with 17.88 billion tokens. In the video, we discuss the challenges of power management and how important it is to practice power management techniques before the competition begins.



Southern University of Science and Technology: At the time of filming, the team captain is uneasy about the status of their server. They spent a lot of time correcting a network and GPU configuration error, plus even more time getting their Spack packages up and running. They were relying on the ability to download the software they needed off the web, but ran into trouble finding the specific packages they needed. Ouch. However, they’re recovering well, as you’d expect from a team that has competed five times before. Good luck, SUSTech.



Taiyuan University of Science & Technology: This is the seventh competition for the team from Taiyuan. When we caught up with them, they were comfortable with their progress and felt that they had everything under control. They have a bit of a hurdle in the competition as they only have two nodes and four GPUs, which could be a little underpowered compared to the rest of the field.



Tsinghua University: This university has competed in a record 25 student cluster competitions world-wide, and has won 13 Gold medals, plus six Silver medals and three Bronze. In other words, they know their way around a student cluster competition. However, this is a brand-new slate of students, which means you can throw the record book out the window. The team was originally planning to run four nodes and eight GPUs but ended up with a configuration of three nodes and six GPUs. In the video, we talk to the team leader about team preparation and their unique approach to workload management. Rather than strictly split up tasks and responsibilities between team members, this edition of Team Tsinghua has decided that everyone is going to work on everything – often at the same time.



University of Science & Technology of China: This team is representing the host institution, USTC, and is located in Hefei, China, on a beautiful campus. Team USTC has competed in nine previous events, taking home Silver and Bronze awards from the ISC 2014 and 2015 competitions. At the time of filming, the team is driving a cluster with four nodes and eight GPUs. The team believes that their biggest challenge will be to control the power draw of their cluster, which is certainly true.



Zhejiang University: At their sixth competition, Team Zhejiang radically departed from the rest of the field in their hardware choice: a single node attached to a PCIe expansion box with eight GPUs. Purists will argue that this isn’t really a cluster, since it’s only a single node, but they’re here and competing, so let’s see what happens. While this config will scream on LINPACK, which is probably what the team is gunning for, it’s doubtful that it can adequately perform on the other, more CPU-centric applications. Zhejiang has successfully captured the LINPACK crown before with their “Suicide LINPACK” at ASC16 (they turned off all their fans, crossed their fingers, and ran a scorching fast HPL). With this system, they have to be the favorite to win HPL again this year.



That’s it for the in-stadium competition, but there’s also a virtual competition (with the same apps but utilizing AWS hardware) that features four teams from outside mainland China. Let’s meet them….

The Chinese University of Hong Kong: We did a quick virtual interview with this school. This is the fourth time the university has entered a team in the ASC competition, but the first time for these students.



Kasetsart University: The pride of Thailand, this is the fifth outing for the team from Kasetsart U. In the interview, we meet all of the team members and talk about their responsibilities in the competition. This is an entirely new team for Kasetsart, but they seem enthusiastic and ready for the fight. Using a cloud is also a new experience for the students, not to mention running HPC applications. So a lot of new experiences are ahead for the Kasetsart team.



National Tsing Hua University: NTHU has competed in a whopping 20 previous student cluster competitions, including the very first one at SC07. Over the years, the team has collected a lot of awards including four Gold medals, two Silver medals, two Bronze awards, and three Highest LINPACK titles. In our interview, the team says that they are ready for the upcoming challenges. In their minds, the most difficult application will be the YLLM (large language model) due to the size of the model. But this is an experienced team, with veterans from their ASC21 and ISC21 teams and their championship SC22 team.



Universidad EAFIT: This is the ninth student cluster competition appearance for the team from Colombia, albeit with new members. In addition to learning the applications, learning HPC, and learning how to navigate the cloud, the team also must contend with an 11-hour time zone difference. Yikes. In the interview, we discuss the competition, their experience in HPC, and why only two members of the team are named Santiago (they could have had more, I think). Since this is their first look at HPC, they feel they’ve had a slow start at getting familiar with the applications. But that’s a common story in student cluster competitions.
