Fixed Bugs for IBM Platform LSF Version 9.1.3
Release Date: July 31 2014

The following bugs have been fixed in LSF Version 9.1.3 between 8 October 2013 and 21 July 2014:

223287
Date: 2013-12-06
Description: The preemption calculation was refined for shared resources to improve the preemption performance and throughput of the whole cluster.
Component: mbschd, schmod_preemption.so
Platform: All
Impact: Throughput of the cluster is diminished when it takes a long time to get small jobs preempted.

223587
Date: 2013-10-16
Description: The parameter MAX_EVENT_STREAM_SIZE cannot limit the size of the lsb.status file. After this fix, either the oldest lsb.stream.timestamp file or the oldest lsb.status.timestamp file will be deleted and then a new file will be written when the number for MAX_EVENT_STREAM_FILE_NUMBER is reached.
Component: mbatchd, liblsbstream.so
Platform: All
Impact: The lsb.status file grows to a very large size and causes a storage space issue. When the storage system does not work well, the LSF cluster may become unavailable.

223589
Date: 2013-10-09
Description: Duplicate data is logged in the lsb.status file which causes a PK violation in PA.
Component: mbatchd
Platform: All
Impact: Cluster Admin sees many PK violations on Platform Analytics side.

223671
Date: 2013-11-01
Description: When running a large number of short jobs (for example, sleep 3) on a Windows host, some jobs show as exited even though they ran successfully.
Component: res.exe
Platform: Windows
Impact: All jobs run successfully, but LSF reports that some small jobs have exited.

223799
Date: 2014-05-04
Description: To allow for enabling and disabling the sourcing of LSB_SUB_PARAM_FILE without causing an error, the internal parameter LSB_SUB_ADDITIONAL_REMOVAL_FROM_PARAMFILE (in lsf.conf) is used. When LSB_SUB_ADDITIONAL_REMOVAL_FROM_PARAMFILE is set as Y or y, LSB_SUB_ADDITIONAL should be removed from LSB_SUB_PARAM_FILE and exported as an environment variable.
By default, LSB_SUB_ADDITIONAL is not removed and the setting of LSB_SUB_ADDITIONAL_REMOVAL_FROM_PARAMFILE is incorrectly regarded as N or n.
Component: bsub, mesub
Platform: All
Impact: A script using LSB_SUB_ADDITIONAL no longer works in LSF 9.1.1.

224016
Date: 2013-10-20
Description: POE will core dump if a user specifies LSB_PJL_TASK_GEOMETRY in the submission script for a job and the number of task groups in the task geometry is not equal to the number of execution hosts allocated by LSF.
Component: permapi.so
Platform: Linux
Impact: POE jobs that run inside LSF core dump.

224323
Date: 2013-10-31
Description: When using bsub -M to submit an exclusive job with a memory limit so large that it exceeds the execution host’s maximum physical memory, an out of memory (OOM) condition occurs.
Component: sbatchd
Platform: Linux\UNIX
Impact: A host becomes unavailable when the job exceeds the host’s memory.

224677
Date: 2013-10-14
Description: The XDR buffer used by the Master LIM to send resource information to the remote cluster is set to a fixed size. If the number of resources on a cluster is increased so that the resource information size is larger than the buffer, connections between the member clusters will break. In this fix, the buffer size is calculated dynamically based on resource configuration in a cluster.
Component: lim
Platform: All
Impact: Platform MultiCluster fails if too many resources are defined in lsf.shared.

224701
Date: 2013-10-29
Description: Jobs occasionally get stuck with the pending reason New job is waiting for scheduling. This may occur when trying to brun a job when the job is already scheduled.
Component: mbschd
Platform: All
Impact: Some jobs are stuck in a pending state until administrators notice or users complain to administrators.

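The failure condition described for bug 224016 (a task geometry whose number of task groups differs from the number of allocated execution hosts) can be checked before a job is submitted. The following Python sketch is only an illustration and is not part of LSF. It assumes that the geometry string uses the form {(2,5,7)(0,6)(1,3)(4)}, with one parenthesised group per host; the function names and host names are hypothetical.

```python
import re


def count_task_groups(task_geometry):
    """Count the parenthesised task groups in a task geometry string.

    Assumption: each group is written as "(id,id,...)" with no nesting.
    """
    return len(re.findall(r"\([^()]*\)", task_geometry))


def geometry_matches_hosts(task_geometry, hosts):
    """Return True when there is exactly one task group per distinct host."""
    return count_task_groups(task_geometry) == len(set(hosts))


# Hypothetical values used only to show the check.
geometry = "{(2,5,7)(0,6)(1,3)(4)}"
allocated_hosts = ["hostA", "hostB", "hostC"]
if not geometry_matches_hosts(geometry, allocated_hosts):
    print("task groups and execution hosts differ; adjust the geometry or the allocation")
```

A check of this kind would typically run in the submission script, where the geometry is set, once the expected number of execution hosts is known.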
224928
Date: 2013-10-31
Description: When cgroup accounting features are enabled (set by LSB_MEMLIMIT_ENFORCE=y, LSF_PROCESS_TRACKING=Y, or LSF_LINUX_CGROUP_ACCT=Y in lsf.conf), the job may be terminated by a memory limit (set by bsub -M or MEMLIMIT in lsb.queues). The memory usage accounting incorrectly shows that the job exceeds the memory limit, so the job is terminated.
Component: sbatchd
Platform: Linux
Impact: The job is terminated by MEMLIMIT unexpectedly when cgroup memory accounting is enabled.

225236
Date: 2013-10-28
Description: When no decay value is defined for the queue, a job submitted with "decay=0" will get rejected, even though this is a valid value. For example: bsub -R "rusage[mem=300:decay=0]"
Component: mbatchd
Platform: All
Impact: A job with a valid rusage string is rejected.

225401
Date: 2013-12-20
Description: mbatchd restarts, replays the events file, and core dumps if you use bsub to submit a new job and the job's resource requirement string (such as "order[ ]") is longer than 511 characters.
Component: mbatchd
Platform: All
Impact: mbatchd core dumps and the LSF batch system is unavailable.

226154
Date: 2013-11-20
Description: If an interactive job is submitted that tries to pass the job's input data to bsub using a pipe ( | ), bsub will not forward the input data to the job correctly. For example, if the command used is echo "exit" | bsub -Ip tcsh -s, the "exit" string will not be passed to tcsh -s and the job will not exit.
Component: bsub
Platform: Linux/Solaris
Impact: Some interactive jobs do not run.

226391
Date: 2013-11-15
Description: The LSF/Clearcase integration utility daemons.wrap does not show the actual Clearcase view name (ccview) in the error log.
Component: daemons.wrap
Platform: All
Impact: Error messages in the log are not sufficient to diagnose problems.

226849
Date: 2013-12-24
Description: mbatchd hangs when there are duplicate job events with the same jobID in the lsb.events file.
The duplicate events are caused by the failover, for example, when the network is operating erratically.
Component: mbatchd
Platform: All
Impact: mbatchd does not respond and the LSF batch system is unavailable.

227047
Date: 2013-12-20
Description: When the nofile limit is set to unlimited, LSF daemons take a long time to start and the cluster behaves abnormally. With this fix applied, when the nofile limit is set higher than 65535, an INFO level message is logged for lim, res, mbatchd, and sbatchd stating: The nofile limit is in excess of 65535. This may cause performance issues with your cluster.
Component: lim, res, mbatchd, sbatchd
Platform: All
Impact: Cluster becomes unusable.

227222
Date: 2013-12-22
Description: After a socket error occurs on Windows hosts, subsequent jobs cannot run and the following error message appears in the child sbatchd log:
    sbdChild: starting mode=-s handles=664:600
    rcvJobFile: chanRead_() failed. A socket operation has failed: Socket operation on non-socket.
    execJob: Job <4949999> <494999>failed in rcvjobfile_(), Unknown error.
Component: sbatchd.exe
Platform: Windows
Impact: All subsequent jobs remain pending on the host and sbatchd must be restarted to process the jobs.

227274
Date: 2013-12-13
Description: For jobs submitted using bsub -n, the bstop command does not work as designed. After using bstop, the bjobs output shows the job in SUSPEND status. However, the job is still running if checked by a system command.
Component: sbatchd
Platform: All
Impact: A job is not stopped with bstop, even though bjobs shows the job is in SUSPEND status.

227413
Date: 2013-12-03
Description: An application level checkpoint array element in a job array remains pending after a brestart command.
Component: mbatchd
Platform: All
Impact: Restarting a checkpoint array job fails and the job remains pending.

227571
Date: 2013-12-05
Description: If JOB_DEP_LAST_SUB (in lsb.params) is set to 0, there are warning messages indicating that the parameter will be ignored.
JOB_DEP_LAST_SUB is set to 1 by default.
Component: mbatchd
Platform: All
Impact: JOB_DEP_LAST_SUB in lsb.params cannot be disabled.

228315
Date: 2013-12-27
Description: When QJOB_LIMIT is defined in lsb.queues, jobs submitted to the queue occasionally cannot be dispatched even if there are available slots.
Component: mbschd, schmod_limit.so
Platform: All
Impact: Some jobs in a queue cannot be scheduled even if there are slots available.

228346
Date: 2013-12-30
Description: When LSB_MIXED_PATH_ENABLE (in lsf.conf) is set to Y and bsub is used with a long command name, the job may exit with the incorrect status.
Component: sbatchd
Platform: All
Impact: Some jobs using long command names may exit with the incorrect status.

228349
Date: 2013-12-20
Description: During job cleanup, the effective user ID for sbatchd is changed to root. When the user’s home directory is mounted from the NFS server and configured without root privilege, the job post script will fail.
Component: sbatchd
Platform: Linux
Impact: .lsbatch/* files cannot be cleaned up.

228399
Date: 2014-01-28
Description: PREEMPT_JOBTYPE=BACKFILL does not work. When configuring a low priority queue with a job to preempt a backfill job, LSF does not allow the job (which may not have a run limit) to preempt the backfill job because it may delay the start of the job in the high priority queue.
Component: mbatchd
Platform: All
Impact: An administrator cannot set up a queue that would preempt a backfill queue under certain conditions.

228501
Date: 2014-01-03
Description: Windows hosts cannot execute esub.bat.
Component: bsub.exe
Platform: Windows
Impact: Cannot use esub.bat on Windows hosts.

228596
Date: 2014-01-27
Description: Messages for lsb.acct file rotation and deletion are not logged.
Component: mbatchd
Platform: All
Impact: No lsb.acct file rotation info message for users.

228863
Date: 2014-01-27
Description: In MultiCluster environments, bjobs -pac -l jobid does not show the PREDICTEDREMAINTIME column.
Component: mbatchd, bjobs
Platform: All
Impact: The PAC Multi-cluster feature is missing the PREDICTEDREMAINTIME value in the GUI.

228865
Date: 2014-01-22
Description: Frequently using bhosts to check the host status causes the child mbatchd to core dump.
Component: mbatchd
Platform: All
Impact: Child mbatchd core dumps and does not respond to b* query commands.

228919
Date: 2014-03-27
Description: Remote jobID information is unavailable in bjobs output. Therefore, Platform Application Center is unable to show the remote jobID of a job in a remote cluster.
Component: bjobs
Platform: All
Impact: Platform Application Center cannot show the remote jobID from bjobs output.

229142
Date: 2014-01-22
Description: Due to a script error, elim.gpfshost cannot be started. Therefore, local host GPFS information (gtotalin and gtotalout) cannot be collected and reported.
Component: elim.gpfshost
Platform: Linux
Impact: elim.gpfshost exits when loading host GPFS information.

229370
Date: 2014-03-03
Description: When starting dynamic hosts on a particular subnet they should join the cluster automatically and start taking jobs. LSF adds them to the correct host group but they never leave closed_LIM status in bhosts, even if all LSF daemons on them are restarted.
Component: lim
Platform: All
Impact: Dynamic batch execution hosts cannot be added to a cluster if the package size of the slave lim configuration that needs to be sent to the master lim is bigger than the MTU.

229403
Date: 2014-03-18
Description: No error is recorded in the lim log when lim does not have permission to open lsf.conf or lsf.cluster.
Component: lim
Platform: All
Impact: lim error message is confusing or absent, making it difficult to debug a problem.

229567
Date: 2014-02-24
Description: A job that is not configured with application level rerun, but is in a rerun queue (that is, a "rerunnable" job) will no longer be rerunnable after an Administrator runs badmin reconfig.
Component: mbatchd
Platform: All
Impact: Rerunnable jobs are no longer rerunnable after an Administrator runs badmin reconfig.

229638
Date: 2014-03-05
Description: When issuing the commands lsadmin resrestart/shutdown or lsadmin resdebug -o, esub is called unnecessarily.
Component: lsadmin
Platform: All
Impact: Running lsadmin resrestart/shutdown or lsadmin resdebug -o takes a longer time than necessary.

229721
Date: 2014-02-14
Description: LSF does not remove the file hostAffinityFile after the job is finished.
Component: sbatchd
Platform: Linux
Impact: The hostAffinityFile temp file must be deleted manually.

229819
Date: 2014-03-10
Description: When a Platform MPI job is running across nodes, the run limit is not handled correctly with signal SIGUSR2.
Component: blaunch
Platform: Linux
Impact: Platform MPI jobs are killed prematurely instead of shutting down gracefully.

229891
Date: 2014-02-20
Description: In a MultiCluster lease mode environment, submitting a job from an LSF 8.0.1 host to an LSF 9.1.2 host causes an mbatchd core dump on LSF 9.1.2.
Component: mbatchd
Platform: All
Impact: Jobs cannot be submitted from LSF 8.0.1 to LSF 9.1.2 in MultiCluster lease mode.

229909
Date: 2014-03-04
Description: Occasionally, the mbschd log contains the error Cannot connect to the mbatchd:, even when all processes are functioning correctly.
Component: mbatchd
Platform: All
Impact: Error message gives the impression that the cluster is not functioning well.

229920
Date: 2014-02-27
Description: Value of the PER_PROJECT parameter (in lsb.resources) is limited to 59 characters.
Component: mbatchd
Platform: All
Impact: Value of PER_PROJECT is limited to 59 characters.

229971
Date: 2014-03-13
Description: mbatchd generates some duplicate records in the lsb.status file.
Component: mbatchd
Platform: All
Impact: The lsb.status file grows very quickly and impacts the PA reports due to a PK violation.

230012
Date: 2014-02-20
Description: When both bkill and brequeue are run on a job at the same time, a memory error occurs, causing an mbatchd core dump.
Component: mbatchd
Platform: All
Impact: mbatchd core dumps and LSF batch system is unavailable.

230060
Date: 2014-03-10
Description: LSF randomly calculates an incorrect memory usage for MPI jobs.
Component: sbatchd
Platform: Linux
Impact: The correct memory usage for MPI jobs is not available.

230096
Date: 2014-02-26
Description: The value of PREDICTEDSTARTTIME in the output of bjobs -l -pac jobid contains a redundant <>.
Component: bjobs
Platform: Linux
Impact: Pending jobs cannot be viewed in PAC.

230113
Date: 2014-03-11
Description: Scheduler performance suffers after using bmod to modify a job group with a limit when there are many jobs in the group.
Component: mbschd, schmod_limit.so
Platform: All
Impact: Jobs cannot be dispatched.

230305
Date: 2014-03-10
Description: Performance of scheduler with resource reservation suffers when there are many pending jobs.
Component: mbschd, schmod_reserve.so, schmod_parallel.so
Platform: All
Impact: Scheduler performance degradation.

230389
Date: 2014-03-14
Description: If there are multiple *.swtag files under $LSF_TOP/properties/version, the multiple FUSERGRP and FXUSR entries cause patchinstall to fail.
Component: patchinstall
Platform: Linux\UNIX
Impact: Patch cannot be installed by patchinstall.

230428
Date: 2014-03-03
Description: Large memory leak with pim on Mac OS X 10.7.
Component: pim
Platform: Mac OS X
Impact: Jobs will pend since memory is not available.

230570
Date: 2014-03-17
Description: When using the bswitch command to switch a running job to another queue, the JOBS limit defined in lsb.resources is bypassed.
Component: mbatchd
Platform: Linux2.6-glibc2.3-x86_64
Impact: A running job in a low priority queue can be switched to a high priority queue and exceed the JOBS limit defined on the high priority queue.

230578
Date: 2014-03-13
Description: lsadmin ckconfig or reconfig does not show an error message when Begin ClusterAdmins is missing in the lsf.cluster file.
Component: lim, lsadmin
Platform: All
Impact: The cluster is unavailable after a restart and an error message indicating the problem is not available.

230580
Date: 2014-04-11
Description: Problems when using lsmake to build customized Android 4.3 code:
    (1) Building native Android 4.3 using lsmake is very slow.
    (2) Cannot build HTC customized Android code on two hosts using lsmake.
    (3) Slots are occupied by lsmake, leading to low slot usage.
The following options have been added to the command lsmake to accommodate this feature:
    --max-cross-host-level <number>
        If lsmake enters <number> level, do not distribute the task to other hosts. The default value is a large integer.
    --no-block-shell-mode
        Perform "shell" tasks without blocking mode. Without this parameter, blocking mode is used.
Component: lsmake, lsmakerm
Platform: All
Impact: lsmake cannot be used and performs worse than gmake when building Android 4.3 code.

230780
Date: 2014-03-19
Description: When using the badmin reconfig command, the fatal error: Master host <host_name> is not defined in the Host section of the lsb.hosts file is reported if a short host name is defined in LSF_MASTER_LIST (in lsf.conf) and a long host name is defined in /etc/hosts and /etc/sysconfig/network.
Component: mbatchd
Platform: All
Impact: badmin reconfig quits with a fatal message.

230963
Date: 2014-05-17
Description: When the DNS is responding slowly, mbatchd responds to b* commands slowly as well, because it attempts to resolve host group names.
Component: mbatchd
Platform: All
Impact: mbatchd responds to b* commands slowly.

230979
Date: 2014-05-24
Description: The startup script hostsetup does not work on hosts with Mac OS X 10.8 and higher and LSF daemons do not start up after reboot.
Component: hostsetup
Platform: Mac OS X
Impact: LSF has to be restarted manually after every restart to configure the LSF startup script even though hostsetup has been run.

230985
Date: 2014-03-11
Description: Job information should be exported as environment variables to eexec, so that eexec does not have to issue bjobs or bhist to get job information.
Component: sbatchd
Platform: All
Impact: Poor job submission and execution performance with GOLD integration.

231078
Date: 2014-03-13
Description: mbatchd generates some duplicate records in the lsb.status file.
Component: mbatchd
Platform: All
Impact: The lsb.status file grows very quickly and impacts the PA report feature due to a PK violation.

231135
Date: 2014-03-19
Description: Using the bsub -cwd option or setting DEFAULT_JOB_CWD in lsb.params can unintentionally overwrite the value of the LS_SUBCWD environment variable.
Component: sbatchd
Platform: All
Impact: The original submission directory is lost in the execution environment.

231143
Date: 2014-03-21
Description: A child mbatchd core dump occurs when ENABLE_EVENT_STREAM is enabled, CONDENSE_PENDING_REASONS is set to ALL, and the system fails to write to the pendingreasons.<cluster_name> file for any reason (for example, there is no space left for the directory set by PENDING_REASONS_TMP_DIR or the file has been deleted).
Component: mbatchd
Platform: All
Impact: Child mbatchd core dump occurs but there is no observable impact to the LSF cluster.

231286
Date: 2014-03-25
Description: The mbatchd log includes the message RB_rusageUpdate() Scheduler doesn't scheduled job xx@xxxx on host xxx, but SBD reports job usage info from that host when resized jobs are run and hosts are released on completion.
Component: mbatchd
Platform: All
Impact: Confusing warning message.

231357
Date: 2014-03-31
Description: A job is put into a pending state with the pending reason Job has a specified start time if both the -b and -R requirements have not been met.
However, the job's pending reason will be changed if there is another job pending due to the same -R option.
Component: mbschd
Platform: All
Impact: The job pending reason is incorrect.

231589
Date: 2014-04-20
Description: An mbatchd core dump occurs when upgrading LSF from version 9.1.1 to 9.1.2, if sbatchd is restarted first, then mbatchd is restarted.
Component: mbatchd, sbatchd
Platform: All
Impact: mbatchd core dump occurs and LSF batch system is unavailable.

231611
Date: 2014-04-02
Description: The operator || does not work for shared resources in rusage with parallel jobs even if there are free resources available for a sibling resource requirement.
Component: mbschd, schmod_parallel.so
Platform: All
Impact: Parallel jobs with a sibling resource requirement cannot be dispatched even if there are enough resources in the cluster.

232162
Date: 2014-04-04
Description: A job submitted with a file limit is killed if the information is sent to the res log but it has already reached the file limit.
Component: res
Platform: All
Impact: Jobs submitted to LSF cluster are killed incorrectly.

232240
Date: 2014-04-04
Description: When LSB_KRB_TGT_FWD=Y and LSB_AFS_JOB_SUPPORT=Y are configured in lsf.conf, res causes an FD leak which eventually means there are no FDs available.
Component: res
Platform: Linux\UNIX
Impact: LSF jobs fail to renew AFS tokens.

232315
Date: 2014-04-14
Description: The error System error 109 has occurred is returned when using the net stop command on Windows to stop res or sbatchd.
Component: res, lim
Platform: Windows
Impact: Scripts cannot be used to manage daemons.

232334
Date: 2014-04-13
Description: An sbatchd core dump occurs when AIX is upgraded to the TL and service pack 710003-01-1341 and jobs are submitted.
Component: sbatchd
Platform: AIX
Impact: sbatchd core dumps and LSF reports the job is exited.

232736
Date: 2014-04-30
Description: Using the LSF API to read a streaming file causes a memory leak.
Component: liblsbstream.so
Platform: All
Impact: A memory leak causes the application’s memory usage to grow exponentially.

232824
Date: 2014-04-22
Description: A resource requirement is not changed by bmod after a checkpoint job is restarted.
Component: sbatchd
Platform: All
Impact: Checkpoint jobs restart with a wrong RES_REQ.

233074
Date: 2014-04-20
Description: In a mixed MultiCluster, forward environment with MultiCluster lease mode, mbatchd core dumps when it restarts.
Component: mbatchd
Platform: All
Impact: mbatchd core dumps and LSF batch system is not available.

233234
Date: 2014-04-30
Description: When DJOB_ENV_SCRIPT (in lsb.queues) is configured and the openmpi_rankfile.sh script is used, the file created by the script cannot be accessed. The openmpi_rankfile.sh script is missing an environment.
Component: res, openmpi_rankfile.sh
Platform: Linux\UNIX
Impact: The blaunch/openMPI integration is not complete for the CPU binding.

233424
Date: 2014-04-22
Description: Before a job is finished, bpeek fails to display the job output and generates an unclear error message: ls_rstat: File operation failed: No such file or directory.
Component: bpeek
Platform: All
Impact: An unclear error message is generated by bpeek.

233455
Date: 2014-04-29
Description: elim.mic contains a hard coded library path to libmicmgmt.so. If libmicmgmt.so is not installed in the default directory, elim.mic exits.
Component: elim.mic
Platform: Linux
Impact: elim.mic exits and no GUP information is collected.

233534
Date: 2014-07-03
Description: If sbatchd is not responding or is unavailable when mbatchd attempts to send modification to it, sbatchd will never receive the modification information.
Component: mbschd, sbatchd
Platform: All
Impact: Changing a running job’s run time limit with bmod does not take effect when sbatchd is unavailable.

233565
Date: 2014-04-24
Description: Loading the Kerberos library fails and causes sbatchd to core dump when running jobs if LSB_KRB_TGT_FWD=Y is set but LSB_KRB_TGT_DIR is not configured in lsf.conf.
Component: sbatchd
Platform: Linux\UNIX
Impact: LSF jobs fail to renew AFS tokens.

233610
Date: 2014-04-25
Description: After migration to a new host, a job array element remains in a pending state.
Component: mbatchd
Platform: All
Impact: A long-running checkpoint job fails to start in LSF after migration to a new host.

234252
Date: 2014-05-08
Description: An mbatchd core dump occurs, caused by a memory overflow, when using bsub -f with %J and the file path is larger than 256 characters.
Component: mbatchd
Platform: All
Impact: mbatchd core dumps and LSF batch system is not available.

234588
Date: 2014-05-11
Description: When running patchinstall outside the installation directory, patchinstall fails with the error Error: Unable to access jar file lap/LAPApp.jar.
Component: patchinstall
Platform: Linux/Unix
Impact: patchinstall fails to install patches when it is issued outside the installation directory.

234858
Date: 2014-05-23
Description: LSF does not recognize a user group and gives a warning message when running badmin reconfig if a user group cannot be searched by LDAP due to an LDAP size limitation.
Component: mbatchd
Platform: Linux\UNIX
Impact: User group is not recognized by LSF.

235115
Date: 2014-06-18
Description: Problem with guarantee and preemption features. High priority, guaranteed consumer jobs remain pending even when resource requirements are met.
Component: mbschd, schmod_default.so
Platform: All
Impact: Pending jobs do not run even if resource requirements are met.

235149
Date: 2014-06-10
Description: In a MultiCluster environment, the error message MCB_encodeMcbMsg: xdr_func() failed is frequently found incorrectly placed in the mbatchd log.
Component: mbatchd
Platform: All
Impact: The execution cluster cannot send job usage information to the submission cluster.

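The configuration that triggers bug 233565 (LSB_KRB_TGT_FWD=Y set in lsf.conf without LSB_KRB_TGT_DIR) can be spotted with a simple scan of the file. The following Python sketch is not an LSF utility. It assumes that lsf.conf holds plain KEY=VALUE lines with # comments and that a lowercase y counts as enabled; the function names are hypothetical.

```python
def parse_conf(text):
    """Parse simple KEY=VALUE lines, ignoring blank lines and # comments."""
    params = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        params[key.strip()] = value.strip().strip('"')
    return params


def krb_tgt_dir_missing(params):
    """Return True when TGT forwarding is enabled but no TGT directory is set."""
    forwarding_on = params.get("LSB_KRB_TGT_FWD", "").upper() == "Y"
    return forwarding_on and not params.get("LSB_KRB_TGT_DIR")


# Hypothetical lsf.conf fragment used only to show the check.
sample = "LSB_KRB_TGT_FWD=Y\n# LSB_KRB_TGT_DIR is not configured\n"
if krb_tgt_dir_missing(parse_conf(sample)):
    print("LSB_KRB_TGT_FWD=Y is set but LSB_KRB_TGT_DIR is not configured")
```

In practice the text would be read from the site's lsf.conf file rather than from an inline string.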
235662
Date: 2014-05-29
Description: If Intel MPI is installed in a directory other than the default /opt directory, Intel MPI jobs using the PAM/TASK starter framework cannot run unless the MPI_TOPDIR is manually changed in the Intel MPI wrapper.
Component: intelmpi_wrapper
Platform: Linux
Impact: The intelmpi_wrapper must be modified manually if the Intel MPI location is not the default directory.

235681
Date: 2014-05-27
Description: An XDR error message MCB_channelJobDecsnToRemote: MCB_sendDispDecsnToCluster() failed occurs in mbatchd, and the job cannot be forwarded to a remote cluster.
Component: mbatchd
Platform: All
Impact: Jobs are not forwarded to the remote cluster in MultiCluster lease mode.

235696
Date: 2014-06-16
Description: When checkpoint jobs are submitted by a script, the command brestart -W does not work and restarted jobs cannot be terminated by RUNLIMIT.
Component: sbatchd, erestart
Platform: Linux
Impact: After running brestart, the job cannot exit and keeps cycling between "Checkpoint initiated" and "Checkpoint succeeded".

235889
Date: 2014-06-04
Description: When executing a job containing multiple tasks, task RES calculates an incorrect XDR size and causes an XDR encoding error.
Component: res
Platform: All
Impact: Jobs that are launched by blaunch fail to execute.

235891
Date: 2014-05-30
Description: In MultiCluster lease mode, an uninitialized variable sometimes causes an mbschd core dump.
Component: mbschd, schmod_default.so
Platform: All
Impact: Job dispatch fails.

235920
Date: 2014-05-30
Description: The RES log contains an incorrect spelling for "acquire".
Component: res, mbschd
Platform: All
Impact: Typo in an error message of the RES log file.

235925
Date: 2014-06-17
Description: The command combination bkill -r -J <job_name> does not work as expected.
Component: bkill
Platform: All
Impact: bkill -r -J <job_name> does not kill jobs as expected.

236105
Date: 2014-06-10
Description: In MultiCluster lease mode, a buffer overflow occurs when the lease.state.file is too large, causing an mbatchd core dump.
Component: mbatchd
Platform: All
Impact: mbatchd core dumps and the LSF batch system is not available.

236187
Date: 2014-07-02
Description: When the first line of a bsub job script is larger than 16361 characters a core dump occurs.
Component: bsub
Platform: Linux2.6-glibc2.3-x86_64
Impact: bsub core dumps and the job is not submitted.

236477
Date: 2014-06-17
Description: When mbschd encounters an error on a job, other jobs do not get scheduled and remain stuck.
Component: mbschd
Platform: All
Impact: Jobs are pending until badmin reconfig is issued.

236606
Date: 2014-06-11
Description: In MultiCluster forward mode, a parallel job submitted with the same RES_REQ is blocked.
Component: mbschd, schmod_mc.so
Platform: All
Impact: Jobs are stuck in pending status on submission cluster.

236614
Date: 2014-06-11
Description: When the lsb_submit() API is used to submit jobs for both a parent and child process, jobs submitted for the child process do not process LSB_SUB_MODIFY_FILE and LSB_SUB_MODIFY_ENVFILE properly.
Component: liblsf.a, liblsb.so, libbat.a, libbat.so, lsbatch.h, lsf.h
Platform: All
Impact: esub does not work with the lsb_submit() API.

236705
Date: 2014-07-02
Description: When a shared resource is configured for the cluster, vemkd reports a warning message: lsfinit: resource <resource_name> is being used by multiple hosts. It cannot be used in a resource requirement expression.
Component: vemkd
Platform: All
Impact: Error messages appear in the vemkd log file even with a correct configuration.

236752
Date: 2014-07-01
Description: When preemption and guaranteed SLA are both enabled, mbschd will take a long time to finish one scheduling cycle, especially when there are tens of thousands of pending jobs.
Component: mbschd
Platform: All
Impact: mbschd performance issue causes low job throughput of the LSF cluster.

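On clusters that do not yet have the fix for bug 236187, job scripts can be screened for an overlong first line before they are passed to bsub. The following Python sketch is an illustration only. The 16361-character threshold is taken from the bug description; whether the line terminator counts toward that number is not stated, so the sketch excludes it as an assumption, and the function name is hypothetical.

```python
import io

# Threshold quoted in the description of bug 236187.
FIRST_LINE_LIMIT = 16361


def first_line_too_long(stream, limit=FIRST_LINE_LIMIT):
    """Return True when the first line of the script exceeds the limit.

    Assumption: the trailing newline is not counted as part of the line.
    """
    first_line = stream.readline().rstrip("\r\n")
    return len(first_line) > limit


# Hypothetical script content used only to show the check.
script = io.StringIO("#!/bin/sh " + "x" * 20000 + "\necho done\n")
if first_line_too_long(script):
    print("first line exceeds the limit; shorten it before submitting")
```

For a script on disk, the same function can be called on an open file object.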
237058
Date: 2014-06-24
Description: A core dump occurs when using bsub to submit a job, and the specified command and its arguments contain multiple quotations.
Component: bsub
Platform: All
Impact: Some jobs cannot be submitted.

237460
Date: 2014-07-14
Description: After enabling LSB_QUERY_ENH (in lsf.conf), if there are many bhosts requests and there is an affinity host in the cluster or affinity is enabled in the cluster, the query child mbatchd will core dump repeatedly. The core dump is caused by a "thread unsafe function".
Component: mbatchd
Platform: All
Impact: Child mbatchd core dumps and b* query commands do not work.

237644
Date: 2014-07-03
Description: When using the brestart command with GOLD Integration jobs, the command fails due to some missing job information such as a project name or jobID.
Component: brestart
Platform: Linux
Impact: GOLD integration does not work well with brestart.

237853
Date: 2014-07-21
Description: When a long pre-execution job is run, but the job is killed before the pre-execution portion is finished, the environment variable LSF_JOB_EXECUSER cannot be retrieved with eexec.
Component: sbatchd
Platform: Linux
Impact: Many GOLD reservations will not be released if gcharge fails and a job is killed.

237867
Date: 2014-07-02
Description: Occasionally esub does not work when LSF_TMPDIR is set to a shared file system because the temp file for one job is overridden by another job.
Component: bsub
Platform: All
Impact: Incorrect or missing job submission options are set in esub.

237955
Date: 2014-06-27
Description: mbatchd exits when the license error Unable to contact LIM occurs.
Component: mbatchd
Platform: All
Impact: Error message is confusing.

238419
Date: 2014-07-03
Description: A job that is running on another host occasionally fails when the bpeek command is used in a mixed cluster environment.
Component: bpeek
Platform: All
Impact: bpeek occasionally does not work.

238881
Date: 2014-07-21
Description: For exclusive jobs, the reserved job cannot be backfilled.
Component: mbschd, schmod_reserve.so
Platform: All
Impact: Short jobs cannot use backfill slots.

Copyright and trademark information

© Copyright IBM Corporation 2014

U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

IBM®, the IBM logo and ibm.com® are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and trademark information" at www.ibm.com/legal/copytrade.shtml.