Federico Stagni edited this page Jan 18, 2024 · 24 revisions

General

The concept of "Setup" is being phased out. The concept (explained in https://dirac.readthedocs.io/en/latest/AdministratorGuide/Introduction/diraccomponents.html) has a long history and made it possible to use a single machine as a server for multiple setups/installations (e.g. production and testing setups on the same node). This possibility is no longer considered useful, so the functionality is being removed.

Core/Framework

use InstalledComponentsDB
ALTER TABLE `HostLogging` DROP COLUMN `Setup`;

Accounting

PR https://github.com/DIRACGrid/DIRAC/pull/6565/ makes several simplifications to the Accounting system, and removes the concept of "Setup". This concept has always been reflected in the MySQL database table names, which need to be updated. In order to do so:

  1. Stop the insertion of records, e.g. by stopping the Accounting/DataStore service(s)
  2. Several tables need to be renamed. The following query prints the SQL commands that you need to issue (be sure to replace "DIRAC-Certification" with the name of your setup).
SET group_concat_max_len=5000;
SELECT group_concat(v.name separator '; ')
 FROM (
     SELECT concat('RENAME TABLE `', t.table_name, '` TO `', replace(t.table_name, '_DIRAC-Certification_', '_'), '`') name
     FROM information_schema.tables t
     WHERE table_name like '%_DIRAC-Certification_%'
 ) v;

(you might need to run the above more than once).

And then:

DELETE FROM `ac_catalog_Types` WHERE name LIKE 'DIRAC-Certification%';

(again, replace 'DIRAC-Certification%' with the name of your setup).

  3. Restart the DataStore service(s)
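
As an alternative to the GROUP_CONCAT query in step 2, the rename statements can be generated one per table. This sidesteps the group_concat_max_len limit, which is presumably why the query may need several runs (an assumption, not stated in the PR). The sketch below is plain Python and not part of DIRAC: build_rename_statements is a hypothetical helper, and the table names are made up for illustration. In practice the input would be the table list of your Accounting database.

```python
def build_rename_statements(table_names, setup):
    """Return one RENAME TABLE statement per table whose name embeds the setup."""
    marker = f"_{setup}_"
    statements = []
    for name in table_names:
        if marker in name:
            # Same substitution as the SQL replace() call: "_<setup>_" becomes "_"
            new_name = name.replace(marker, "_")
            statements.append(f"RENAME TABLE `{name}` TO `{new_name}`;")
    return statements


# Hypothetical table names, for illustration only
tables = [
    "ac_type_DIRAC-Certification_Job",
    "ac_bucket_DIRAC-Certification_Job",
    "ac_catalog_Types",
]
statements = build_rename_statements(tables, "DIRAC-Certification")
for statement in statements:
    print(statement)
```

Tables that do not contain the setup marker (such as ac_catalog_Types here) are left out, so the generated list can be reviewed before it is executed.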

Transformations

Following the changes to permission enforcement in the Transformation System, make sure that the hosts running TransformationSystem agents have the ProductionManagement property. The same applies to any shifterProxy and groups that are used for Transformations.
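
A quick way to review this is to list the hosts that lack the property. The sketch below works on a plain dictionary and does not query the DIRAC Configuration Service: hosts_missing_property is a hypothetical helper, and the host names are invented. You would fill the dictionary from the Properties option of each host in your Registry.

```python
REQUIRED_PROPERTY = "ProductionManagement"


def hosts_missing_property(host_properties, required=REQUIRED_PROPERTY):
    """Return the sorted names of hosts whose property list lacks `required`."""
    return sorted(
        host for host, properties in host_properties.items() if required not in properties
    )


# Hypothetical hosts, for illustration only
hosts = {
    "ts-agents.example.org": ["TrustedHost", "ProductionManagement"],
    "other.example.org": ["TrustedHost"],
}
missing = hosts_missing_property(hosts)
print(missing)
```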

Removal of DIRACSetup and OwnerDN in WMS and Transformations

Following changes in PRs https://github.com/DIRACGrid/DIRAC/pull/6164, https://github.com/DIRACGrid/DIRAC/pull/6566, https://github.com/DIRACGrid/DIRAC/pull/7157, https://github.com/DIRACGrid/DIRAC/pull/7124:

  1. Stop job submission, e.g. by stopping the JobManager service(s)
  2. Stop pilot submission, e.g. by stopping the SiteDirector agent(s)
  3. Stop transformation submission, e.g. by stopping the TransformationManager service(s)
  4. Alter the following tables:
use JobDB;
ALTER TABLE `Jobs` DROP COLUMN `DIRACSetup`;
ALTER TABLE `Jobs` DROP COLUMN `OwnerDN`;
use PilotAgentsDB;
ALTER TABLE `PilotAgents` DROP COLUMN `GridRequirements`;
ALTER TABLE `PilotAgents` DROP COLUMN `OwnerDN`;
ALTER TABLE `PilotAgents` DROP COLUMN `ParentID`;
ALTER TABLE `PilotAgents` CHANGE `OwnerGroup` `VO` VARCHAR(128);
use SandboxMetadataDB;
ALTER TABLE `sb_EntityMapping` DROP COLUMN `EntitySetup`;
ALTER TABLE `sb_Owners` DROP COLUMN `OwnerDN`;
use TaskQueueDB;
ALTER TABLE `tq_TaskQueues` DROP COLUMN `Setup`;
ALTER TABLE `tq_TaskQueues` ADD COLUMN `Owner` VARCHAR(255) NOT NULL;
ALTER TABLE `tq_TaskQueues` MODIFY COLUMN `OwnerDN` VARCHAR(255);
use TransformationDB;
ALTER TABLE `Transformations` ADD COLUMN `Author` VARCHAR(255) NOT NULL;
ALTER TABLE `Transformations` MODIFY COLUMN `AuthorDN` VARCHAR(255) DEFAULT NULL;
  5. Save the following script in any directory (e.g. /opt/dirac) of a DIRAC server machine:
from DIRAC.Core.Base.Script import parseCommandLine
parseCommandLine()

from DIRAC.ConfigurationSystem.Client.Helpers.Registry import getUsernameForDN
from DIRAC.TransformationSystem.DB.TransformationDB import TransformationDB
from DIRAC.WorkloadManagementSystem.DB.TaskQueueDB import TaskQueueDB

tqDB = TaskQueueDB()
tsDB = TransformationDB()

dnToOwner = {}

# queries

# TaskQueues
res = tqDB._query("SELECT DISTINCT OwnerDN from `tq_TaskQueues`")
if not res["OK"]:
    print(f"ERROR -- {res['Message']}")
    exit(1)

tqOwnerDNs = []
if not res["Value"]:
    print("Nothing to update in tq_TaskQueues")
else:
    tqOwnerDNs = [t[0] for t in res["Value"]]

# Transformations
res = tsDB._query("SELECT DISTINCT AuthorDN from `Transformations`")
if not res["OK"]:
    print(f"ERROR -- {res['Message']}")
    exit(1)

tsOwnerDNs = []
if not res["Value"]:
    print("Nothing to update in Transformations")
else:
    tsOwnerDNs = [t[0] for t in res["Value"]]

# get the owners
for ownerDN in set(tqOwnerDNs).union(tsOwnerDNs):
    res = getUsernameForDN(ownerDN)
    if not res["OK"]:
        print(f"ERROR -- {res['Message']}")
        if res["Message"].startswith("No username found for dn"):
            owner = "unknown"
        else:
            exit(1)
    else:
        owner = res["Value"]
    dnToOwner[ownerDN] = owner

# updates
for ownerDN, owner in dnToOwner.items():
    # TaskQueues
    query = f"UPDATE `tq_TaskQueues` SET Owner = '{owner}' WHERE OwnerDN = '{ownerDN}'"
    res = tqDB._query(query)
    if not res["OK"]:
        print(f"ERROR -- {res['Message']}")
        exit(1)
    # Transformations
    query = f"UPDATE `Transformations` SET Author = '{owner}' WHERE AuthorDN = '{ownerDN}'"
    res = tsDB._query(query)
    if not res["OK"]:
        print(f"ERROR -- {res['Message']}")
        exit(1)
    # PilotAgents

print("Done!")

Then run with python script_name.py -o /DIRAC/Security/UseServerCertificate=yes

  6. Restart the JobManager service(s)
  7. Restart the TransformationManager service(s)

And also (this table should be empty by now):

use TaskQueueDB;
DROP TABLE `tq_TQToSubmitPools`;

Added index

Following https://github.com/DIRACGrid/DIRAC/issues/7335

use TransformationDB;
ALTER TABLE `Transformations` ADD INDEX `status_index` (`Status`), ADD INDEX `type_index` (`Type`), ALGORITHM=INPLACE;

Make TaskQueueDB VO-aware

  1. Before updating to 8.1, create the VO column:
use TaskQueueDB;
ALTER TABLE tq_TaskQueues ADD COLUMN VO VARCHAR(64);
  2. Once the WMS service is updated to 8.1, the following script can be run:
import MySQLdb
from DIRAC.ConfigurationSystem.Client.Helpers.Registry import getVOForGroup

# MySQL database connection parameters (those values are for the CI environment)
db_host = "mysql"
db_user = "Dirac"
db_password = "Dirac"
db_name = "TaskQueueDB"

# Create a connection to the MySQL database
db = MySQLdb.connect(host=db_host, user=db_user, passwd=db_password, db=db_name)
cursor = db.cursor()

# Fetch data from the table and update the 'VO' column
try:
    cursor.execute("SELECT DISTINCT OwnerGroup FROM tq_TaskQueues")
    rows = cursor.fetchall()
    for row in rows:
        owner_group = row[0]
        vo = getVOForGroup(owner_group)
        if not vo:
            db.rollback()
            raise Exception(f"VO not found for OwnerGroup '{owner_group}'")
        update_query = "UPDATE tq_TaskQueues SET VO = %s WHERE OwnerGroup = %s"
        cursor.execute(update_query, (vo, owner_group))
    db.commit()
    print("Updated 'VO' column with values from OwnerGroup.")
except MySQLdb.Error as e:
    db.rollback()
    print(f"Error updating 'VO' column: {e}")

# Set the 'VO' column to be NOT NULL
try:
    cursor.execute("ALTER TABLE tq_TaskQueues MODIFY COLUMN VO VARCHAR(64) NOT NULL")
    db.commit()
    print("Set 'VO' column to NOT NULL.")
except MySQLdb.Error as e:
    db.rollback()
    print(f"Error setting 'VO' column to NOT NULL: {e}")


# Close the database connection
db.close()
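
The script above stops at the first OwnerGroup that has no VO. Before running it, it can help to list every such group in one pass. The sketch below is a minimal illustration: groups_without_vo is a hypothetical helper, and the resolver is a stand-in dictionary lookup with invented group and VO names. On a DIRAC server you would pass getVOForGroup as the resolver and the result of the SELECT DISTINCT OwnerGroup query as the group list.

```python
def groups_without_vo(owner_groups, resolve_vo):
    """Return the groups for which resolve_vo yields no VO (empty string or None)."""
    return [group for group in owner_groups if not resolve_vo(group)]


# Stand-in for the Registry lookup; names are hypothetical
fake_registry = {"group_a": "vo_one", "group_b": ""}
unresolved = groups_without_vo(
    ["group_a", "group_b", "group_c"],
    lambda group: fake_registry.get(group, ""),
)
print(unresolved)
```

An empty result suggests the update loop should complete without raising.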

Make JobDB VO-aware

  1. Before updating to 8.1, create the VO column:
use JobDB;
ALTER TABLE Jobs ADD COLUMN VO VARCHAR(64) DEFAULT 'Unknown';
  2. Once the WMS service is updated to 8.1, the following script can be run:
import MySQLdb
from DIRAC.ConfigurationSystem.Client.Helpers.Registry import getVOForGroup

# MySQL database connection parameters (those values are for the CI environment)
db_host = "mysql"
db_user = "Dirac"
db_password = "Dirac"
db_name = "JobDB"

# Create a connection to the MySQL database
db = MySQLdb.connect(host=db_host, user=db_user, passwd=db_password, db=db_name)
cursor = db.cursor()

# Fetch data from the table and update the 'VO' column
try:
    cursor.execute("SELECT DISTINCT OwnerGroup FROM Jobs")
    rows = cursor.fetchall()
    for row in rows:
        owner_group = row[0]
        vo = getVOForGroup(owner_group)
        if not vo:
            db.rollback()
            raise Exception(f"VO not found for OwnerGroup '{owner_group}'")
        update_query = "UPDATE Jobs SET VO = %s WHERE OwnerGroup = %s"
        cursor.execute(update_query, (vo, owner_group))
    db.commit()
    print("Updated 'VO' column with values from OwnerGroup.")
except MySQLdb.Error as e:
    db.rollback()
    print(f"Error updating 'VO' column: {e}")

# Close the database connection
db.close()
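
To try out the per-group update logic without touching a production database, the same pattern can be exercised against an in-memory SQLite table. This is only a stand-in: SQLite replaces MySQL here (note the `?` placeholders instead of `%s`), the table is reduced to the columns involved, and backfill_vo plus the group and VO names are hypothetical. The function also reports how many rows still carry the 'Unknown' default afterwards.

```python
import sqlite3


def backfill_vo(conn, group_to_vo):
    """Set VO per OwnerGroup; return the number of rows still marked 'Unknown'."""
    cursor = conn.cursor()
    cursor.execute("SELECT DISTINCT OwnerGroup FROM Jobs")
    for (owner_group,) in cursor.fetchall():
        vo = group_to_vo.get(owner_group)
        if not vo:
            conn.rollback()
            raise ValueError(f"VO not found for OwnerGroup '{owner_group}'")
        cursor.execute("UPDATE Jobs SET VO = ? WHERE OwnerGroup = ?", (vo, owner_group))
    conn.commit()
    cursor.execute("SELECT COUNT(*) FROM Jobs WHERE VO = 'Unknown'")
    return cursor.fetchone()[0]


# Reduced stand-in for the Jobs table, with made-up rows
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Jobs (JobID INTEGER PRIMARY KEY, OwnerGroup TEXT, VO TEXT DEFAULT 'Unknown')"
)
conn.executemany(
    "INSERT INTO Jobs (OwnerGroup) VALUES (?)",
    [("group_a",), ("group_a",), ("group_b",)],
)
conn.commit()

remaining = backfill_vo(conn, {"group_a": "vo_one", "group_b": "vo_two"})
rows = conn.execute("SELECT OwnerGroup, VO FROM Jobs ORDER BY JobID").fetchall()
print(remaining, rows)
conn.close()
```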

Remove VMDIRAC

VMDIRAC has been removed from the release.

  1. Stop agent(s) WorkloadManagement/CloudDirector and service(s) WorkloadManagement/VirtualMachineManager, either directly from the machine where they are running:
runsvctrl d /opt/dirac/startup/WorkloadManagement_CloudDirector
runsvctrl d /opt/dirac/startup/WorkloadManagement_VirtualMachineManager

or, better, using the dirac-admin-sysadmin-cli CLI.

  2. Remove VirtualMachineDB
DROP DATABASE VirtualMachineDB;
  3. Uninstall VirtualMachineManager and CloudDirector using the dirac-admin-sysadmin-cli CLI.
  4. Remove any credentials you might have used from the server

Enable remote pilot logging

PR https://github.com/DIRACGrid/DIRAC/pull/6208 introduces the possibility of storing pilot log files on remote storage. A plugin is used for this purpose; the PR contains a FileCacheLoggingPlugin, which sends the logs to an SE.

  • Install WorkloadManagement/TornadoPilotLoggingHandler service
  • Install WorkloadManagement/PilotLoggingAgent agent

Configuration: Configuration is done on a VO-by-VO basis, in a VO-specific Pilot section in Operations. The Defaults section can be used as usual to set up initial settings for all VOs.

  • Enable remote logging in the Pilot section of a VO: RemoteLogging = True
  • Set the service URL: RemoteLoggerURL = https://dirac.host.name:8444/WorkloadManagement/TornadoPilotLogging
  • Set the upload SE, e.g.: UploadSE = UKI-LT2-IC-HEP-disk
  • Uploading is done by a Shifter called DataManager, so a shifter of this name should be defined in a shifter section of the VO
  • Set the upload path a VO can write to, e.g.: UploadPath = /gridpp/pilotlogs

The TornadoPilotLoggingHandler service requires a plugin name to be specified under Services/TornadoPilotLogging:

  • LoggingPlugin = FileCacheLoggingPlugin
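
The Pilot-section options listed above can be checked for completeness before the service and agent are started. The sketch below operates on a plain dictionary rather than on the Configuration Service. missing_pilot_logging_options is a hypothetical helper, and treating all four options as mandatory is an assumption made for the illustration.

```python
# Option names as listed above; treating all of them as required is an assumption
REQUIRED_OPTIONS = ("RemoteLogging", "RemoteLoggerURL", "UploadSE", "UploadPath")


def missing_pilot_logging_options(pilot_section):
    """Return the required options that are absent or empty in the Pilot section."""
    return [option for option in REQUIRED_OPTIONS if not pilot_section.get(option)]


# Example Pilot section with UploadPath deliberately left out
pilot_section = {
    "RemoteLogging": "True",
    "RemoteLoggerURL": "https://dirac.host.name:8444/WorkloadManagement/TornadoPilotLogging",
    "UploadSE": "UKI-LT2-IC-HEP-disk",
}
missing = missing_pilot_logging_options(pilot_section)
print(missing)
```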

ProductionsManagement

DB changes

Following changes in PR https://github.com/DIRACGrid/DIRAC/pull/7162

  1. Stop submission of productions, e.g. by stopping the ProductionManager service(s)
  2. Alter the following table:
use ProductionDB;
ALTER TABLE `Productions` ADD COLUMN `Author` VARCHAR(255) NOT NULL;
ALTER TABLE `Productions` MODIFY COLUMN `AuthorDN` VARCHAR(255) DEFAULT NULL;
  3. Restart the ProductionManager service(s)

RMS

DB changes

Following changes in PR https://github.com/DIRACGrid/DIRAC/pull/6419 and https://github.com/DIRACGrid/DIRAC/pull/6653

  1. Stop submission of requests, e.g. by stopping the ReqManager service(s)
  2. Alter the following table:
use ReqDB;
ALTER TABLE `Request` DROP COLUMN `DIRACSetup`;
ALTER TABLE `Request` ADD COLUMN `Owner` VARCHAR(255);
  3. Restart the ReqManager service(s)

WMS

Speed up the SiteDirector

DB changes

Following changes in PR https://github.com/DIRACGrid/DIRAC/pull/7110, TaskQueues are no longer part of the PilotAgentsDB:

  1. Stop the Site Directors
  2. Alter the following table
use PilotAgentsDB;
ALTER TABLE `PilotAgents` DROP COLUMN `TaskQueueID`;
  3. Restart the Site Directors

Configuration

Different pilot "submission policies" now exist:

  • WaitingSupportedJobs (the default one): will only fill up some of the available slots, according to the number of waiting jobs. It does not take into account the number of already submitted pilots for these jobs (might result in some unused pilots but should not be a big issue, as it should target sites that are just occasionally used).
  • AggressiveFilling: will just fill up the available slots, no matter whether there are waiting jobs

To configure the submission policy, add the following option to the SiteDirector section of the CS:

SubmissionPolicy = <SubmissionPolicy name>
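
The difference between the two policies can be pictured with a simplified model. This is not the SiteDirector implementation: pilots_to_submit is a hypothetical function, and capping WaitingSupportedJobs at the number of waiting jobs is an assumption about how "according to the number of waiting jobs" might translate into a pilot count.

```python
def pilots_to_submit(policy, available_slots, waiting_jobs):
    """Simplified model of how many pilots a policy would submit to a queue."""
    if policy == "AggressiveFilling":
        # Fill every available slot, whether or not jobs are waiting
        return available_slots
    if policy == "WaitingSupportedJobs":
        # Assumed rule: never submit more pilots than there are waiting jobs
        return min(available_slots, waiting_jobs)
    raise ValueError(f"Unknown submission policy: {policy}")


waiting_policy_count = pilots_to_submit("WaitingSupportedJobs", available_slots=10, waiting_jobs=3)
aggressive_count = pilots_to_submit("AggressiveFilling", available_slots=10, waiting_jobs=3)
print(waiting_policy_count, aggressive_count)
```

In this model a queue with many free slots but few waiting jobs receives far fewer pilots under WaitingSupportedJobs than under AggressiveFilling.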