The exe returns the following error to the calling SSIS package: The process exit code was "2" while the expected was "0"

Jan 8, 2014 at 2:35 PM
Hello,

We are facing a weird problem on our Production and Dev servers with version 3 of ManagePartition,exe. There are about 17 child packages which are simultaneously called by a Master package. However, at a time only 14 out of the 17 packages are running. Each of these child packages has an execute command task which executes ManagePartition,exe with appropriate parameters. The Master package runs daily in the morning. It completes half the times without any problem and fails for the remaining half the number of executions. So, this looks like some kind of a concurrency problem.
The error we get from the child package is as follows:
The process exit code was "2" while the expected was "0"

I tried to run the master package in debug mode and through SQL agent on the day of the failure to mimic the problem, but it cant be reproduced.

Do we know when does the process exit code return a "2" and what does that code mean?
Any help regarding this would be greatly appreciated.

Thanks,
Ritesh B.
Coordinator
Jan 8, 2014 at 5:40 PM

Hi Rightesh

Exit code 2 is a generic return code that indicates some error during execution. The specific error message would be sent to console output where you could read it if running interactively. If you are running inside an SSIS package, you can capture the detailed error message to a text file by using the following in the Execute Task command:

Managepartition ... > errorfileN.txt

(replace errorfileN with a ***unique*** name per child package). The files will contain the specific error messages to aid debugging and show you the failure causes for the failing packages.

For future I'll want to capture these errors into windows event log..

Let me know what you find…

Jan 8, 2014 at 8:44 PM
Thank you so much Stuarto. I shall try implementing the output log file for each call to the exe and let you know what we find when it fails next.
Jan 23, 2014 at 5:49 PM
Hi Stuarto and Everyone else,

This morning our ETL failed on our Development environment with the following error in the SSIS package while running ManagePartition.exe
Error: The process exit code was "1" while the expected was "0".

However, this time we had the output of each such call to the exe redirected to a log file and the text from the log file is listed below.
Can you please help us out and provide us some directions on how we can fix this.

I have enabled trace in our Dev SQL Server for logging information about deadlocks by running the following statements
DBCC TRACEON (1204, -1)
DBCC TRACEON (1222, -1)
In addition, I have modified the startup parameters of the Dev SQL server to include –T1204 and –T1222, so that SQL Server will continue to log deadlock related information if its restarted.

Text from the log file for ManagePartition.exe:

ManagePartition.exe started on
Thu 01/23/2014
10:41:09.28
***** ManagePartition.exe LOG START *****
System.Data.SqlClient.SqlException: Transaction (Process ID 181) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection)
at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj)
at System.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj)
at System.Data.SqlClient.SqlDataReader.HasMoreRows()
at System.Data.SqlClient.SqlDataReader.ReadInternal(Boolean setTimeout)
at Microsoft.SqlServer.Management.Smo.DataProvider.ReadInternal()
at Microsoft.SqlServer.Management.Smo.DataProvider.Read()
at Microsoft.SqlServer.Management.Smo.SqlSmoObject.InitObjectsFromEnumResults(Urn levelFilter, IDataReader reader, Boolean forScripting, List`1 urnList, Int32 startLeafIdx, Boolean skipServer)
at Microsoft.SqlServer.Management.Smo.SqlSmoObject.InitChildLevel(Urn levelFilter, ScriptingPreferences sp, Boolean forScripting)
at Microsoft.SqlServer.Management.Smo.SmoCollectionBase.InitializeChildCollection(Boolean refresh)
at Microsoft.SqlServer.Management.Smo.SmoCollectionBase.GetEnumerator()
at PartitionManagement.PartitionManager.CreateStgChecks()
at PartitionManagement.App.Run(AppArgs& parsedArgs)
Object reference not set to an instance of an object.
********** ManagePartition.exe FAILED **********
Failure happened on
Thu 01/23/2014

10:41 AM

Thanks,
Ritesh.
Apr 2, 2014 at 8:30 PM
Hi Stuarto and Everyone else,

The problem happened again this morning and the deadlock is occurring because one instance of the ManagePartition.exe is running an ALTER TABLE XXXXX CHECK CONSTRAINT statement and another instance of the exe is running a "SELECT XXX FROM sys.tables AS tbl INNER JOIN sys.check_constraints... " statement. I think this can be resolved if we have a version of the exe which will execute these select statements in a READ UNCOMMITTED mode. Or else a parameter can be added to the existing exe to provide the facility to select a READ UNCOMMITTED mode for all the select statements issued by the exe. Can we have such a facility.

Thanks,
Ritesh.
Apr 29, 2014 at 9:20 AM
Hi Stuarto,

can this issue be solved with the workaround suggested by Anthony in this thread?

https://sqlpartitionmgmt.codeplex.com/discussions/470467

Thank you.
May 14, 2014 at 3:23 PM
This problem is fixed now.
The problem was caused to a deadlock occurring when one instance of the exe was trying read from sys.tables and the other one was trying to alter the schema with an Alter table command.

--SQL from 1st instance of the exe
SELECT
cstr.name AS [Name]
FROM
sys.tables AS tbl
INNER JOIN sys.check_constraints AS cstr ON cstr.parent_object_id=tbl.object_id
WHERE
(tbl.name=@_msparam_0 and SCHEMA_NAME(tbl.schema_id)=@_msparam_1) ORDER BY
[Name] ASC

--SQL from 2nd instance of the exe
--Gets blocked by above query, these Aux tables are gone after they are loaded
ALTER TABLE [GenStg].[ReconciledTypeAuxCustID1] CHECK CONSTRAINT [chk_ReconciledTypeAuxCustID1_partition_1]

To solve this, we set the transaction isolation level to READ UNCOMMITTED for the connection object in the C# source code of the exe and then rebuilt the exe. We haven't experienced this error since then.
Jul 27, 2015 at 3:43 PM
Hi Everyone,

Looks we found a better way to fix this problem without any code change to the exe and by just changing a lock level attribute at the table level.
The problem used to occur while running multiple instances of the exe running. Some of them were getting deadlocked and were chosen as the victim causing the exe to give the message mentioned in the subject of this discussion.

The best way to fix it is to set the lock escalation level to AUTO for all the tables that might use this exe.
Here is the syntax on SQL Server to set it.
ALTER TABLE <schema_name>.<table_name> SET (LOCK_ESCALATION=AUTO)

Earlier, the lock escalation level for these tables was set to TABLE, which is the default value and results in more deadlocks.

You can create a script to set LOCK_ESCALATION=AUTO for all tables in a schema by running the below SQL. You can then run the script to set lock escalation level for all tables in that schema:
--ALTER TABLE schema.tablename SET (LOCK_ESCALATION=AUTO)
select 'ALTER TABLE [' + s.name + '].[' + t.name + '] SET (LOCK_ESCALATION=AUTO)'
from sys.tables t inner join sys.schemas s on t.schema_id = s.schema_id
where s.name = '<schema_name>' and lock_escalation_desc ='TABLE'

Thanks,
Ritesh Bhandarkar.
Aug 5, 2015 at 4:48 PM
Hi RItesh,

the LOCK_ESCALATION workaround doesn't work for me: I always get a deadlock error.
So I used the READ UNCOMMITTED transaction isolation leve, adding this method in the PartitionManager class (in the PartitionManagement.cs file):
public void SetTransactionIsolationLevelReaduncommitted()
{
    System.Collections.Specialized.StringCollection sc = new System.Collections.Specialized.StringCollection();

    string readUncomm = "SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED";
    sc.Add(readUncomm);

    scriptChunks.Add(sc);
    if (executeCommands) conn.ExecuteScalar(readUncomm);
}
and I called it in the App class, method Run, in this position:
using (scriptWriter)
{
    // Call appropriate Partition Manager constructor depending on whether a partition number or range value is provided
    if (partitionNumber != 0)
    {
        pm = new PartitionManagement.PartitionManager(sc, dbName, schName, partitionTblName,
           stagingTblName, partitionNumber, scriptWriter, executeCommands);
    }
    else
    {
        pm = new PartitionManagement.PartitionManager(sc, dbName, schName, partitionTblName,
           stagingTblName, partitionRangeValue, scriptWriter, executeCommands);
    }

    pm.SetTransactionIsolationLevelReaduncommitted();
    
    ...
and all seems work fine.
Aug 11, 2015 at 6:51 PM
,Hi lucazav

Thanks for the update.
Actually, I also started facing similar problems and used the "SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED" statement. So, I believe its the best solution for this problem.

Thanks,
Ritesh.
Nov 11, 2015 at 11:23 AM
Hi rightesh,

I forked the project and solved this issue here:

http://lucazav.github.io/SqlServerPartitionManagementUtility/

Hope to see you as a contributor there :)
Nov 11, 2015 at 4:49 PM
Hi Lucazav,
Got your messasge. Thanks for your contribution.
I believe in SQL Server 2016, truncation can be done directly at the partition level. So, looks like we wont need this exe after we move to SQL Server 2016, But that's still a long way to go. And even if we use SQL Server 2016, this exe will help in other cases to where we need to manage partitions.
Thanks,
Ritesh.