IPv6 deployment at Tier-2 sites
This page tracks the status of the IPv6 deployment at WLCG Tier-2 sites.
Mandate and goals
The imminent exhaustion of the IPv4 address space will eventually require the WLCG services to be migrated to an IPv6 infrastructure, on a timeline heavily dependent on the needs of individual sites. For this reason the HEPiX IPv6 Working Group was created in April 2011 with this mandate.
The WLCG Operations Coordination and Commissioning Team has established an IPv6 Task Force to work in close collaboration with the HEPiX IPv6 WG on these aspects (listed in chronological order):
- Define realistic IPv6 deployment scenarios for experiments and sites
- Maintain a complete list of clients, experiment services and middleware used by the LHC experiments and WLCG
- Identify contacts for each of the above and form a team of people to run tests
- Define readiness criteria and coordinate testing according to the most relevant use cases
- Recommend viable deployment scenarios
Scenarios
We can classify the actors in these categories:
- Users: end users (human or robotic) using a client interface to interact with services
- Jobs: user processes running on a batch node
- Site services: services present at all sites (CE, SE, BDII, CVMFS, ARGUS, etc.)
- Central services: services present at a few sites (VOMS, MyProxy, Frontier, Nagios, etc.)
The following table describes the IP protocol requirements of the corresponding nodes, on a timescale limited to a few years from now.
Node | Network | Requirement |
---|---|---|
User | IPv4 | MUST work, as users can connect from anywhere |
User | IPv6 | SHOULD work, but it would concern only very few users |
User | dual stack | MUST work, it should be the most common case in a few years |
Batch | IPv4 | MUST work, as some batch systems might not work on IPv6, or e.g. the site might want to use AFS internally |
Batch | IPv6 | MUST work, as some sites might exceed their IPv4 allocation otherwise |
Batch | dual stack | MUST work, as some sites might want to use legacy software but also be fully IPv6-ready (e.g. CERN) |
Site service | IPv4 | SHOULD work, as many institutes will not adopt IPv6 for some years, but they are strongly encouraged to do it |
Site service | IPv6 | SHOULD work, but it would concern only very few sites |
Site service | dual stack | MUST work, it should be the most common case in a few years |
Central service | IPv4 | MAY work, but central services can be expected to run at sites with an IPv6 infrastructure |
Central service | IPv6 | MAY work, as above sites certainly have an IPv4 infrastructure |
Central service | dual stack | MUST work, and all above sites are expected to be able to provide dual-stack nodes |
Eventually everything MUST work with IPv6 only, but that is many years away.
Concerning storage federations, where any batch node can access data on any SE, it is evident that ALL SEs MUST be in dual stack.
More generally, all services that interact with users MUST work in dual stack, as user nodes may use either protocol.
To summarise, these are the implications on testing:
- central services MUST be tested in dual stack mode
- site services MUST be tested in dual stack mode
- user nodes MUST be tested in IPv4 and dual stack mode
- batch nodes MUST be tested in IPv4, IPv6 and dual stack mode (not all three configurations might be possible for a given site, though)
From now on, all services are assumed to run on dual-stack nodes. Moreover, when testing on a dual-stack testbed, tests need to be run forcing either IPv4 or IPv6 on the client node, as illustrated in the sketch below.
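As an illustration of what "forcing" a protocol means in practice, here is a minimal sketch (in Python, with a hypothetical service hostname and port) that restricts name resolution to a single address family and attempts a TCP connection; the same idea applies to any of the services in the tests below.

```python
# Sketch: check which IP protocols a service endpoint answers on, forcing one
# address family at a time from the client side. Host and port are placeholders;
# adjust them to the service under test (e.g. a CE or an SE).
import socket

def connect_with_family(host, port, family):
    """Resolve `host` restricted to one address family and try a TCP connect."""
    try:
        infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
    except socket.gaierror:
        return False  # no A (IPv4) or AAAA (IPv6) record for this family
    for af, socktype, proto, _canon, sockaddr in infos:
        s = socket.socket(af, socktype, proto)
        s.settimeout(5)
        try:
            s.connect(sockaddr)
            return True  # endpoint reachable over this family
        except OSError:
            continue
        finally:
            s.close()
    return False

for name, family in [("IPv4", socket.AF_INET), ("IPv6", socket.AF_INET6)]:
    ok = connect_with_family("ce.example.org", 8443, family)
    print("%s: %s" % (name, "reachable" if ok else "not reachable"))
```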
Use cases to test
Basic job submission
The user submits a job using the native middleware clients (CREAM client, Condor-G, etc.) or intermediate services (gLite WMS, glideinWMS, PanDA, DIRAC, AliEN, etc.).
User | CE | Batch | Notes |
---|---|---|---|
IPv4 | dual stack | IPv4 | |
IPv4 | dual stack | dual stack | |
IPv4 | dual stack | IPv6 | |
dual stack | dual stack | IPv4 | also forcing IPv6 on user node |
dual stack | dual stack | dual stack | also forcing IPv6 on user node |
dual stack | dual stack | IPv6 | also forcing IPv6 on user node |
All "auxiliary" services (ARGUS, VOMS, MyProxy, etc.) are supposed to work in dual stack, but for practical purposes they may initially run on IPv4, to avoid requiring a fully dual-stack service stack from the start. This remark is general and applies to all tests described below.
When intermediate services are used, the tests become considerably more complex, given the larger number of services involved.
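For the direct-submission case, a minimal sketch of such a test, assuming the CREAM command-line client is installed on the user node, could look like the following; the CE endpoint and the JDL file are placeholders, and the flags shown (-a for automatic proxy delegation, -r for the CE and queue) follow the usual CREAM CLI usage.

```python
# Sketch: drive a basic job submission test through the CREAM client.
# CE endpoint and JDL path are placeholders.
import subprocess

CE_ENDPOINT = "ce.example.org:8443/cream-pbs-grid"  # hypothetical dual-stack CE
JDL_FILE = "hello.jdl"                              # trivial test job

def submit_job():
    cmd = ["glite-ce-job-submit", "-a", "-r", CE_ENDPOINT, JDL_FILE]
    return subprocess.call(cmd) == 0

if __name__ == "__main__":
    print("submission %s" % ("succeeded" if submit_job() else "failed"))
```

The same submission would then be repeated for each of the client-side configurations listed in the table above.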
Basic data transfer
The user copies a file from their node to an SE and back.
User | SE | Notes |
---|---|---|
IPv4 | dual stack | |
dual stack | dual stack | also forcing IPv6 on user node |
In this context, a batch node reading/writing to a local or remote SE is treated as a user node. The file copy MUST be tried with all protocols supported by the SE.
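A minimal sketch of this test, assuming the gfal2 command-line client (gfal-copy) is available on the user or batch node, might loop over the protocols offered by the SE; the SE paths and the protocol list below are hypothetical and need to be adapted to the SE under test.

```python
# Sketch: copy a local file to the SE and back over every protocol the SE
# supports, using gfal-copy via subprocess. SE paths are placeholders.
import subprocess

LOCAL_FILE = "file:///tmp/ipv6-test.dat"
SE_PATHS = {
    "srm":    "srm://se.example.org:8446/dpm/example.org/home/testvo/ipv6-test.dat",
    "gsiftp": "gsiftp://se.example.org/dpm/example.org/home/testvo/ipv6-test.dat",
    "https":  "https://se.example.org/dpm/example.org/home/testvo/ipv6-test.dat",
    "xroot":  "root://se.example.org//dpm/example.org/home/testvo/ipv6-test.dat",
}

def copy(src, dst):
    return subprocess.call(["gfal-copy", "-f", src, dst]) == 0  # -f: overwrite destination

for proto, remote in sorted(SE_PATHS.items()):
    up = copy(LOCAL_FILE, remote)                                   # user node -> SE
    down = up and copy(remote, "file:///tmp/back-%s.dat" % proto)   # SE -> user node
    print("%-6s upload=%s download=%s" % (proto, up, down))
```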
Third party data transfer
The user replicates a set of files between sites via FTS-3.
User | SEs (source, destination) | FTS-3 | Notes |
---|---|---|---|
IPv4 | dual stack | dual stack | |
dual stack | dual stack | dual stack |
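A sketch of such a test, assuming the FTS-3 command-line client is available on the user node, could submit a single replication and then follow it; the FTS-3 endpoint and the SURLs below are placeholders.

```python
# Sketch: submit a third-party replication through FTS-3 with the
# fts-transfer-submit client (-s selects the FTS-3 service to contact).
# Endpoint and SURLs are placeholders.
import subprocess

FTS_ENDPOINT = "https://fts3.example.org:8446"   # hypothetical dual-stack FTS-3
SOURCE = "srm://se1.example.org:8446/dpm/example.org/home/testvo/ipv6-test.dat"
DEST = "srm://se2.example.org:8446/dpm/example.org/home/testvo/ipv6-test.dat"

job_id = subprocess.check_output(
    ["fts-transfer-submit", "-s", FTS_ENDPOINT, SOURCE, DEST]).decode().strip()
print("submitted FTS-3 job %s" % job_id)
# The transfer can then be followed with:
#   fts-transfer-status -s <endpoint> <job id>
```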
Production data transfer
The user replicates a dataset using experiment-level tools (PhEDEx, DDM, DIRAC, etc.).
User | SEs (source, destination) | FTS-3 | Experiment tool | Notes |
---|---|---|---|---|
IPv4 | dual stack | dual stack | dual stack | |
dual stack | dual stack | dual stack | dual stack |
Conditions data
A job accesses conditions data from a batch node via Frontier/squid.
Batch | squid | Frontier | Notes |
---|---|---|---|
IPv4 | dual stack | dual stack | |
IPv6 | dual stack | dual stack | |
dual stack | dual stack | dual stack |
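As a sketch of the batch-node side of this test, the following fetches a Frontier URL through the site squid configured as an HTTP proxy; the squid host and the Frontier server are placeholders, and a real test would issue a genuine encoded conditions query rather than a bare request.

```python
# Sketch: mimic a job reading conditions data through the site squid by sending
# an HTTP request to a Frontier server with the squid set as proxy.
# Squid host and Frontier URL are placeholders.
import urllib.request

SQUID = "http://squid.example.org:3128"                                  # hypothetical site squid
FRONTIER_URL = "http://frontier.example.org:8000/FrontierProd/Frontier"  # placeholder server

opener = urllib.request.build_opener(urllib.request.ProxyHandler({"http": SQUID}))
try:
    resp = opener.open(FRONTIER_URL, timeout=10)
    print("HTTP %d via squid" % resp.getcode())  # a real test would append an encoded query
except OSError as exc:
    print("conditions access failed: %s" % exc)
```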
Experiment software
A job accesses experiment software in CVMFS from a batch node.
Batch | squid | Stratum0/1 | Notes |
---|---|---|---|
IPv4 | dual stack | dual stack | |
IPv6 | dual stack | dual stack | |
dual stack | dual stack | dual stack |
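A sketch of this check on a batch node, assuming the CVMFS client tools are installed (the repository name is a placeholder), probes the repository and reports which proxy and Stratum server are actually in use, which helps tell IPv4 paths from IPv6 ones.

```python
# Sketch: verify from a batch node that the experiment software area is
# reachable via CVMFS and report the proxy/Stratum host actually used.
# Repository name is a placeholder.
import subprocess

REPO = "atlas.cern.ch"   # hypothetical repository under test

ok = subprocess.call(["cvmfs_config", "probe", REPO]) == 0   # mounts and checks the repository
print("probe %s: %s" % (REPO, "OK" if ok else "FAILED"))
if ok:
    # which squid and Stratum-0/1 the client is talking to
    subprocess.call(["attr", "-g", "proxy", "/cvmfs/%s" % REPO])
    subprocess.call(["attr", "-g", "host", "/cvmfs/%s" % REPO])
```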
Experiment workflow
A user runs a real workflow (event generation, simulation, reprocessing, analysis).
This test combines all previous tests into one.
Information system
A user queries the information system.
User | BDII | Notes |
---|---|---|
IPv4 | dual stack | |
dual stack | dual stack |
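A sketch of this query, assuming a dual-stack top-level BDII (the hostname is a placeholder) and the standard OpenLDAP ldapsearch client, forces each protocol by resolving the name per address family and passing the literal address in the LDAP URI; port 2170 and the base "o=grid" are the usual BDII values.

```python
# Sketch: query a dual-stack BDII over each IP protocol in turn by passing a
# literal address of the chosen family to ldapsearch. Hostname is a placeholder.
import socket
import subprocess

BDII_HOST = "bdii.example.org"   # hypothetical dual-stack top-level BDII

def query(family, fmt):
    addrs = socket.getaddrinfo(BDII_HOST, 2170, family, socket.SOCK_STREAM)
    literal = fmt % addrs[0][4][0]              # first address for this family
    cmd = ["ldapsearch", "-x", "-LLL",
           "-H", "ldap://%s:2170" % literal,
           "-b", "o=grid", "objectClass=GlueService", "GlueServiceEndpoint"]
    return subprocess.call(cmd) == 0

print("IPv4 query: %s" % query(socket.AF_INET, "%s"))
print("IPv6 query: %s" % query(socket.AF_INET6, "[%s]"))   # IPv6 literal needs brackets
```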
Job monitoring
Monitoring information from jobs, coming either from central services or from batch nodes via messaging systems, is collected, stored and accessed by a user.
User | Monitoring server | Messaging system | Batch | Notes |
---|---|---|---|---|
IPv4 | dual stack | dual stack | IPv4 | |
IPv4 | dual stack | dual stack | IPv6 | |
IPv4 | dual stack | dual stack | dual stack | |
dual stack | dual stack | dual stack | IPv4 | |
dual stack | dual stack | dual stack | IPv6 | |
dual stack | dual stack | dual stack | dual stack |
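For the batch-node leg of this chain, the following is a sketch of publishing a test record to the messaging broker, assuming the stomp.py client (version 4 or later; older versions also need a conn.start() call before connecting) and a STOMP-speaking broker such as ActiveMQ; broker host, port and destination are placeholders.

```python
# Sketch: publish a test monitoring record to the messaging broker from a
# batch node using stomp.py. Broker and destination names are placeholders.
import platform
import stomp

BROKER = ("mq.example.org", 61613)          # hypothetical dual-stack broker
DESTINATION = "/topic/ipv6.readiness.test"  # hypothetical test topic

conn = stomp.Connection(host_and_ports=[BROKER])
conn.connect(wait=True)                     # add username/passcode if the broker requires them
conn.send(DESTINATION, "test record from %s" % platform.node())
conn.disconnect()
```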
IPv6 compliance of WLCG services
AliEN
ARC
ARGUS
BDII
- Contact: Maria Alandes
- Status: BDII is IPv6 compliant since the EMI 2 release. (OpenLDAP OK since v2)
- Further info on the investigation here: https://savannah.cern.ch/bugs/index.php?95839
- In order to enable the IPv6 interface, the YAIM variable BDII_IPV6_SUPPORT needs to be set to 'yes' (the default is 'no'). This is described in the system administrator guide and the corresponding release notes.
BestMAN
CASTOR
cfengine
CMS Tag Collector
cmsweb
CREAM CE
CVMFS
Dashboard Google Earth
dCache
DIRAC
DPM
- Contact: Fabrizio Furano
- Status:
- SRM and rfio need a config workaround
- Glasgow is evaluating DPM on IPv6 as part of its HEPiX involvement (Sam Skipsey). Quite a lot is working; details should come from Sam.
- Dependencies:
- MySQL 5.5 (the minimum version for IPv6) is not available for SL6; DPM has been successfully deployed with MariaDB (IPv6 compliant), which works out of the box.
- xrootd frontend - awaiting v4
- Apache >= 2.2 (used for the HTTP/DAV interface) supports IPv6
EGI Accounting Portal
EOS
Experiment Dashboards
Frontier
FTS
- Contact: Michail Salichos for both FTS2 and FTS3
- FTS2 Status:
- Francesco Prelz has checked FTS2 in the past.
- Should be OK, except for the Globus issue https://ggus.eu/ws/ticket_info.php?ticket=86101
- By default, this would not be fixed if found broken.
- FTS3 status:
- Looks good for FTS3 and its dependencies (modulo the Globus issue mentioned above)
- With the exception of ActiveMQ-CPP: the messaging side will need some attention for IPv6 support.
Ganglia
GFAL/lcg_util
- Contact:
- GFAL2: Adrien Devresse
- gfal/lcg_util: Alejandro Alvarez Ayllon
- Status: gfal2 is plugin based, so it all depends on the plugin
- HTTP: neon supports IPv6
- SRM: gsoap supports IPv6
- GridFTP: it is enabled - https://its.cern.ch/jira/browse/LCGUTIL-4
- DCAP: unknown
- LFC and RFIO: should work
- BDII: OpenLDAP does support IPv6
- Note on gsoap: this is used for web services in a number of cases. It supports IPv6, but this has to be enabled at compile time, so it could be missing in certain builds.
- Note: gfal/lcg_util is probably OK but not tested; by default, it would not be fixed if broken.
glideinWMS
GOCDB
Gratia Accounting
Gridsite
GridView
Gstat
iCMS
LFC
- See details for DPM
MonALISA
MyOSG
MyProxy
MyWLCG
Nagios
OpenAFS
PanDA
perfSONAR
PhEDEx
REBUS
SAM
Scientific Linux
STD IB and QA pages
StoRM
Ticket system (GGUS)
various D web tools
VOMS
gLite WMS
xroot