Presentation is loading. Please wait.

Presentation is loading. Please wait.

OFED for Linux: Status and Next Steps www.openfabrics.org 1 Betsy Zeller (Qlogic), Tziporet Koren (Mellanox) 3/16/2010.

Similar presentations


Presentation on theme: "OFED for Linux: Status and Next Steps www.openfabrics.org 1 Betsy Zeller (Qlogic), Tziporet Koren (Mellanox) 3/16/2010."— Presentation transcript:

1 OFED for Linux: Status and Next Steps www.openfabrics.org 1 Betsy Zeller (Qlogic), Tziporet Koren (Mellanox) 3/16/2010

2 Agenda Linux OFED components Releases completed in 2009: –OFED 1.4.1 & 1.4.2 –OFED 1.5 Releases planned for 2010: –OFED 1.5.1 –OFED 1.5.2/3 Future releases: –OFED 1.6 and beyond How can you contribute? Questions and discussion –Planning for scalability and discussion on MPI inclusion 2 www.openfabrics.org

3 Linux OFED Components HCA/NIC Drivers –IB: IBM, Mellanox, QLogic –iWARP: Chelsio, Intel –RoCEE: Mellanox Core: Verbs, mad, SMA, CM, CMA IPoIB SDP SRP and SRP Target iSER and iSER Target RDS NFS-RDMA Qlogic_VNIC uDAPL OSM Diagnostic tools  Bonding module  Open iSCSI  MPI Components  MVAPICH  Open MPI  MVAPICH2  Benchmark tests  Proprietary MPIs: Intel, Platform MPI  Proprietary SMs: Sun, Voltaire, Qlogic, Mellanox OFA Development Add on Tested with 3

4 OFED 1.4.1 Released on June 3, 2009 Distro integration: –Integrated into RHEL 5.4 –SLES 11 New features –Added support for RHEL 5.3 and SLES11 –NFS/RDMA: In beta quality with support for RHEL 5.2, 5.3 and SLES 10 SP2 –Updated MPI packages: MVAPICH 1.1.0-3355, Open MPI 1.3.2 –Updated bonding package: ib-bonding-0.9.0-40 –Updated DAPL: compat-dapl-1.2.14 and dapl-2.0.19 –Updated OpenSM version to include critical bug fixes –Fixed RDS iWARP support –Low level drivers updated: ehca, mlx4, cxgb3, nes, ipath, mthca –Added a module parameter to control number of MTTs per segment in Mellanox HCAs –mstflint update –Enhanced OpenSM and management tools, user interface, HA, routing enhancements, and more Used in Intel ® Cluster Ready Solutions 4

5 OFED 1.4.2 Released on Aug 6, 2009 –Critical bug fixes, including: –Fixes to NES (Intel iWarp) driver –Fixes to support running with Lustre installed –NFS/RDMA critical bug fixes (still technology preview) Distro integration: –RHEL 5.5 1.4.2+ –SLES 11 SP1 1.4.2 (kernel from 2.6.32) Passed Oracle 11g certification with RDS www.openfabrics.org 5

6 OFED 1.5 Released on December 30, 2009 –Kernel base: 2.6.30 To be included in RHEL 5.6 Used in Intel ® Cluster Ready Solutions List of Supported Kernels for OFED 1.5 –RHEL4: up6, up7, up8 –RHEL5: up2, up3, up4 –SLES10: SP2, SP3 –SLES 11 –Fedora Core 11* –OpenSuSE 11* –Kernel.org: 2.6.18-2.6.32* * minimal QA for these versions. www.openfabrics.org 6

7 OFED 1.5 – New features New OSes: RedHat EL5.4, EL4.8 and SLES 10 SP 3 Kernel Base: 2.6.30 –New supported kernels: 2.6.29-2.6.32 –Forward port to support kernels above 2.6.30 Hardware driver (qib) for new Qlogic QDR HCA Added ConnectX2 support to mlx4 driver User space packages as tar balls for easier distro integration –All libraries available under: http://www.openfabrics.org/downloads/http://www.openfabrics.org/downloads/ SDP: – Zero Copy (beta); more performance improvements Bonding: Included only for older distros that does not include the IB support –RHEL 5.3 / SLES 10 SP2 or older version 7

8 1.5 – New features – Cont. uDAPL: –Scalability enhancements –New UCM provider –Common code base with WinOF 2.1. MPI updates: –OSU MVAPICH 1.2.0 –Open MPI 1.4 –OSU MVAPICH2 1.4 –MPI tests 3.2 NFS-RDMA in beta Many bug fixes www.openfabrics.org 8

9 1.5 – Open SM new features Scalability & performance Optimized SL2VL setup Parallel LFTs setup Parallel MFTs setup Routing & multicast FTree improvements Routing engine reloading Mesh switch reordering optimizations MGID to MLID compression QoS improvements  SL2VL setup optimization  QoS/LASH co-exist Major bug fix  MCG join/leave fixes  Clean delayed MCG deletion 9

10 OFED 1.5.1 To be released within the next week Main new features: –Added RoCEE support –Updated Open MPI to rev 1.4.2 –Updated MVAPICH2 to rev 1.4-2 –Updated DAPL to rev 2.0.26 –Added extended atomic operations to ConnectX (kernel) –Improved NFS/RDMA stability Need more testing –Added IPv6 support to RDMACM –Bug fixes www.openfabrics.org 10

11 OFED 2010 Plans Proposal: –Provide additional dot releases throughout the year 1.5.2 – Jun 1.5.3 – Sep –Delay 1.6, and changing the kernel base, till Q1 2011 Pros: –Improve the stability of OFED 1.5.x, faster bug fix turnaround –Integrate support for new distros as they become available –Logo program: Vendors will be able to fix issues found Cons: –Many patches in the fixes directory Opinions? … www.openfabrics.org 11

12 OFED 1.5.2 New Features Add new OSes: RHEL 5.5, SLES11 SP1 Update the management package Add-on packages that does not touch the core New low level drivers for new HW Critical bug fixes Schedule: June/July –Decide in EWG meetings * OFED 1.5.3 – will decide if needed after 1.5.2 www.openfabrics.org 12

13 OFED 1.6 New Features kernel base 2.6.35 or 36 Virtualization and SRIOV support Mellanox Vnic/vHBA for BridgeX New HW from vendors (if any) New scalability features (if any) Soft-RoCEE & Soft-iWarp drivers –If ready can go into 1.5.x before New management features –3D torus support by OpenSM MMU notifier for MPI –Solution that will be accepted by the kernel Improve connection management scheme? More features – Based on input from customers www.openfabrics.org 13

14 OFED 1.6 Schedule & OS support OS support reduction –Stop supporting old OSes (e.g. RHEL 4.x, SLES 10) Schedule suggestion: –Start to work during Q4 2010 –Release in Q1 2011 Opinions? www.openfabrics.org 14

15 Reminder: How do you contribute? Develop new code and features Bug fixes Performance tuning Contribute backports for new OSes and forward ports for new kernels QA and testing Send patches and comments to the mailing lists: –ewg@lists.openfabrics.org – OFED for Linux specific onlyewg@lists.openfabrics.org –linux-rdma@vger.kernel.org – General Linux developmentlinux-rdma@vger.kernel.org Open bugs in Bugzilla (https://bugs.openfabrics.org/)https://bugs.openfabrics.org/ –Choose OpenFabrics Linux when opening a new bug –Test old bugs with new releases and update on bugzilla Participate in EWG bi-weekly meetings 15

16 Open Discussion Should we have a scalability roadmap for OFED? How do we plan for scalability? Should we continue to include MPIs in the OFED releases? What’s the next step on MMU notifier kernel support for MPI? Questions and feedback from the community 16

17 Scalability Challenges Scale out to 10K-20K or more nodes –Performance –Reliability –Subnet management Focus additional attention/resources on issues –Should we have a BOF on this? www.openfabrics.org 17

18 Infrastructure Scalability/Features Improve multicore affinity/awareness Multicast Flow control for SRQ Fault tolerance IB monitoring –Performance counters, throughput, hotspots, degraded links Adaptive routing –HCA out-of-order delivery –Switch logic for state info & adaptive algorithm, etc. Path Record Support RDMA CM 18

19 MPI Distribution in OFED: Rationale Open source MPI’s initially included to “bootstrap” the OFED project MPI was the main user for OFED, so this seemed like a natural pairing –Made it (significantly) easier for customers to get their MPI jobs running on InfiniBand –However, distros don’t like it – they package their own QA testing of MPI + OFED is still extremely valuable –This is not a discussion of removing MPI + OFED QA How many use MPI from OFED, vs compiling own? 19

20 MPI Distribution in OFED: Pros MPI is still the most common OFED “customer” HPC customers get network stack + MPI in one package –Helps rapid MPI deployment on new clusters (out-of-box) –MPI-selector function allows to select MPI stack of choice during the installation Customers get QA assurance of specific MPI + OFED version tuples Helps to test multiple functionalities of the OFED stack and IB/iWARP fabric with comprehensive suite of MPI-level benchmarks 20

21 MPI Distribution in OFED: Cons MPI’s have their own QA cycles –MPI+OFED QA testing is more for OFED, not MPI Bundling induces project scheduling difficulties between OFED and various MPI packages RedHat and SuSE both say “Don’t do this!” –They both already include the open source MPI’s –Makes it more difficult for them to take OFED drops Many users will download the latest-n-greatest MPIs anyway – not the ones included in OFED 21

22 User Level MMU Notification Userlevel registration caches need to be notified when registered memory is freed –Problem for middleware that hides memory registration –For example: MPI Roland prototyped “ummunotify” –Jeff Sq. adapted Open MPI to use ummunotify –…but ummunotify was rejected by the kernel maintainers –KM’s want same functionality on performance counters MPI’s still need this functionality –Roland not available; Jeff Sq. will do the MPI work –Who can do the kernel side? www.openfabrics.org 22

23 Questions and feedback from the community www.openfabrics.org 23

24 Thank You! www.openfabrics.org 24


Download ppt "OFED for Linux: Status and Next Steps www.openfabrics.org 1 Betsy Zeller (Qlogic), Tziporet Koren (Mellanox) 3/16/2010."

Similar presentations


Ads by Google