#cherylsbirthday

Posted by thomaswmarshall

This past week the hash tags were all chirping about how challenging this logic puzzle was. It seems that Singaporean middle school students are posed with a question that challenges their deductive reasoning as well as their ability to remain objective even when the posing of the question invites them to assume a subjective position within the problem. Technically, it is a contest question for the Singaporean and Asian School Math Olympiad.

You the solver are shown that one of two characters in the problem possesses enough information that they can deduce that the other does not in fact possess enough information to deduce the solution. This fact when announced provides the second character with enough information that he claims to now be able to deduce the answer. That proclamation of deducibility incites the first character to then declare that he too has deduced the answer.

The problem goes something like this:

Cheryl gives a list of dates that could be her birthday to Albert and Bernard. She tells Albert the month and Bernard the day. Albert declares that he does not know the date but that he does know that Bernard does not. Bernard declares that he didn’t know the date but now does. Albert then declare that he too knows. So what is Cheryl’s birthday? Since SQL is my hammer let’s see if this is a nail.

For starters, we have a short list of dates so let’s put them into a table variable.

Next, we can select those dates not in months with days that are unique to a month. We do this because if it were a day unique to the month it would be possible for Bernard to determine which. Albert has decided is impossible (we are assuming these characters are not wrong in their deductions).

After that we select those dates not sharing a day. This is because if it were one, it would be impossible to have deduced the answer knowing only the day and now that it is not in a month with a unique day. It was this second piece of information that enabled Bertrand to deduce the date already knowing the first.

What remains are three dates only one of which is unique to the month, which is a requirement to deduce it knowing only the month and that solution can be deduced knowing the day and that it is not in a month with a unique day. Gosh, Albert and Bernard are smart.

DECLARE @candidates TABLE (rowid INT IDENTITY(1,1), candiDATE DATE);

INSERT INTO @candidates VALUES

(‘May 15 2015’),(‘May 16 2015’),(‘May 19 2015’),

(‘June 17 2015’),(‘June 18 2015’),

(‘July 14 2015’),(‘July 16 2015’),

(‘August 14 2015’),(‘August 15 2015’),(‘August 17 2015’);

WITH COUNTS AS (

SELECT candiDATE, DATEPART(DAY, candiDATE) [Day], DATEPART(MONTH, candiDATE) [Month],

COUNT(DATEPART(DAY, candiDATE))

OVER (PARTITION BY DATEPART(DAY, candiDATE)) [IfItWereAMonthWithAUnqiueDayBernardCouldKnow]

FROM @candidates

) , NEXTCOUNT AS (SELECT candiDATE,

DATEPART(DAY, candiDATE) [Day],

DATEPART(MONTH, candiDATE) [Month],

COUNT(DATEPART(DAY, candiDATE))

OVER (PARTITION BY DATEPART(DAY, candiDATE)) [IfItWasNotAUniqueDayRemainingBernardWouldNotKnow]

FROM @candidates WHERE DATEPART(MONTH, candiDATE) NOT IN (SELECT [Month] FROM COUNTS WHERE [IfItWereAMonthWithAUnqiueDayBernardCouldKnow]=1)

)

SELECT candiDATE, COUNT(DATEPART(DAY, candiDATE))

OVER (PARTITION BY DATEPART(MONTH, candiDATE)) [IfBernardDidnotKnowItWouldnotBeUniqueTotheMonth]

FROM @candidates WHERE candiDATE IN

(SELECT candiDATE FROM NEXTCOUNT n WHERE n.[IfItWasNotAUniqueDayRemainingBernardWouldNotKnow]=1)

Which came first? The Select or the Into?

Posted by thomaswmarshall

This morning while discussing things vaguely reminiscent of work with a colleague of mine the topic of Russell’s paradox was broached. Russell’s paradox relies the concept of a set of sets that do no contain themselves as members. It is a paradox because if that set does not contain itself it satisfies the requirement of membership and then must therefore contain itself, which would be a violation of the rules required for membership. Long story short humanity innovated and a guy named Ernst Zermelo contributed some rules that don’t allow such paradoxes. Zermelo actually discovered the paradox first but didn’t publish. That was confirmed a guy named David Hilbert who drew a really great looking space filling curve with a few simple steps repeated infinitely. The talk of Russell’s paradox prompted me to write the following SQL:

SELECT * INTO #temp FROM tempdb.sys.tables;

SELECT * FROM #temp;

I then posed the question: Will #temp appear as a member of the rows returned from #temp?

You can see an instance of #temp does in fact appear. The suffix appended to it uniquefies the name. This is what allows for multiple temp tables of the same name to be created. We can see this by running the above query in another tab (which opens another session).

The reason that we don’t see the table excluded from the results set is that in order for the temp table to hold the data from the sys.tables catalog view it must exist. And if it exists it will appear in the catalog. This helps to illustrate that SQL is not (always) a procedural language, or that INTO does not logically mean CREATE and INSERT.

SQL is usually typified of a declarative programming language. Being a declarative programming language means that we are not telling the database how to do what we want, only what we want. In this case we told it we want a table that matches the definition of sys.tables and we want it filled with the contents of sys.tables. Contrast this with the having asked SQL to take the contents of sys.tables and put it into a table that matches the definition of sys.tables. In the first case we have to first make the table which puts it into sys.tables, in the second we take the contents of sys.tables and then put it into our newly created table.

This might become clearer if we examine a query plan.

We can clearly see from the plan that sys.tables is a view that draws from the inaccessible mssqlsystemresource system database. It might look like all of those branches contribute to building the table to insert into, but that is just where the data comes from. A simpler select into plan can be seen if we select from a table valued function (TVF).

Here we can see that the Table Insert showplan operator occurs before the Select Into operator. That doesn’t actually create the table, though. The table is created by a Data Definition Language (DDL) operation. DDL operations don’t produce showplan operators. So there you have it, moderately conclusive proof that the table does exist before the read of data to insert occurs, and some vague syntactic hints at why.

Did you check the registry settings

Posted by thomaswmarshall

Earlier tonight I was faced with a corrupted installation of SQL Server. This is perhaps a consequence of my having installed a default instance of SQL Server 2014 after uninstalling an instance of SQL Server 2005. The Engine and SQL Agent would not start. To correct this I had to set the configuration for the engine’s service account to start the engine in SQL Configuration Manager. I then had to add the SQL Agent service account to the login and grant it required permissions.

Self Driving Manual

Posted by thomaswmarshall

Kevin was your normal everyday car buyer. He wasn’t an earlier adopter or gadget obsessed. He worked with technology but not in it. As he sat there in the driver’s seat of ALR-300 listening to beep after beep staring at the flashing light, he couldn’t help but wonder if this was some sort of malicious prank. At least he wasn’t late for an important meeting.

He had bought the ALR-300 so he could just sit and ride. Self driving cars were everywhere. There was another one zipping along, not stopped on the shoulder beeping. He wished he had signed up for the remote care and technology package so he could ask the car why it was beeping. Staring at the dashboard and that blinking light thinking “if only it would…”, he decided to call for help. After a few minutes of checking his email and posting to Facebook about how horrible his day was starting out, he found a tow truck service and called.

The longest 43 minutes dragged by and at least a dozen ALR-300’s zipped past. How many 400’s and 500’s he’d lost count. He was almost regretting buying a self driving car. At least with a manual he could drive it himself. Ah, but how many hours would he have spent staring at break lights instead of his tablet.

The tow truck drove him and his vehicle to the nearest dealership. He explained the problem to the receptionist. She made a face and walked off to get the service manager. He could see them both talking about him. The service manager looked concerned. He picked up the phone.

After a brief conversation he put the phone down and came out of his office. “Kevin, thank you for coming here today. I had to double check with the manufacturer but I am confident that you can drive home. You see the beeping red light that says drive indicates that you need to drive.”

What if…

Posted by thomaswmarshall

What if we were to implement an index structure that relied on Cunningham chains for large data sets and contained leaf nodes that were in turn B+ trees?

My Kickstarter is Launched

Posted by thomaswmarshall

I want to share with you my Kickstarter. The Crash Surveillance Drone is a UAV deployed from a vehicle by the deployment of an air bag. It can alert the authorities, get real time imagery, and signal other drivers of the accident. From multi-car pileups to people struck exiting their vehicles in an accident, I truly believe that my Kickstarter will lead to many saved lives. Please, share this with everyone you know.

https://www.kickstarter.com/projects/358359447/crash-surveillance-drone-csd

And Recursion(…)

Posted by thomaswmarshall

In my last post I discussed the topic of modularity of code in the context of generating prime numbers. In this post I am going to discuss the topic of recursion in SQL server in the context of modular code subsequently in the context of generating prime numbers.

In the prior post we showed how modularizing would allow us more opportunity to improve and correct past work without redoing everything from scratch. We created a function for testing primality called intIsPrime. We call it intIsPrime because if we really want a lot of primes (as many as we can get from SQL Server without inventing a new data type or doing some column and constraint tricks or both) we would need a bigintIsPrime function for all of those primes bigger than the integer data type which ends with the 2,147,483,647 which is prime. 2,147,483,647 is in fact the 3rd double Mersenne prime.

We then went on to query for the primes in the next 100 or so numbers using a technique called a recursive common table expression (CTE). In the spirit of modularity and reaching beyond a mere hundred tests through modular design consider this function:

CREATE FUNCTION dbo.hundredMorePrimeTests(@n int)
RETURNS TABLE
AS RETURN
(

WITH recurseMore AS (
SELECT @n AS p, 1 AS n 
UNION ALL 
SELECT p + 1 , n + 1 FROM recurseMore
	WHERE n < 101
					)
SELECT p, n FROM recurseMore
WHERE dbo.intIsPrime(p) = 1
)

By putting our recursive CTE into a function. We can pass it the parameter we want to test the next hundred integers for primality. This allows us to call it in a stored procedure that bases the call on the last prime in our prime table. We can also have that stored procedure call its self (another type of recursive technique) which enables us to run the stored procedure once and have it execute 32 times (an error occurs is a stored procedure nests more than 32 levels).

CREATE PROCEDURE thirtyTwoHundredPrimalityTests 
	@c INT = 1
AS
BEGIN
	SET NOCOUNT ON;
DECLARE @n INT; 
SELECT @n = MAX(p) FROM dbo.primes;
INSERT INTO dbo.primes 
	SELECT p FROM dbo.hundredMorePrimeTests(@n);
SET @n = @c + 1;
IF @n < 32 
EXEC thirtyTwoHundredPrimalityTests @n ;    
	
END

There you have it, modular and recursive. If we wanted to we could call thirtyTwoHundredPrimalityTests from another stored procedure that could serve to limit the run based on how long we want to run, or how many integers we want to test. We can pass @c as any integer and the result would be that many 100s of integers tested.

Generating More Prime Numbers

Posted by thomaswmarshall

In a previous post I wrote about this new (I think) set of prime numbers and how I generated them with T-SQL. Now, I want to discuss a more philosophical and practical issue. No, I am not going to split hairs about how the prime numbers were there already, like all the integers, and we are just testing them for primality. I also don’t want to discuss why 1 is or is not a prime number (I am not a mathematician). I am actually looking to discuss modularity and code reuse.

Take a look at this function:

DROP FUNCTION intIsPrime;
GO
CREATE FUNCTION intIsPrime ( @n INT)
RETURNS BIT
AS
BEGIN
DECLARE @return BIT; SET @return = 1;
IF @n IN (1,2,3) GOTO finished;
IF @n % 2 = 0 
OR
   @n % 3 = 0
OR 
   FLOOR(SQRT(@n)) = SQRT(@n) 
	--I can't be the first person to note that an integer squareroot means non-prime.
SET @return = 0; GOTO finished;

DECLARE @k INT; SET @k = 5; --By starting at five we are starting at 6k-1
	WHILE @k <= FLOOR(SQRT(@n))
	BEGIN
		IF @n % @k = 0
		OR @n % (@k + 2) = 0 --which makes this 6k+1 
	SET @return = 0; GOTO finished;
	SET @k = @k + 6 ;
	END

finished:
RETURN @return
END

By encapsulating the prime test into a function we separate the operation of testing primality from that of generating integers, as well as from inserting into the table. Why is this advantageous? Consider a scenario where we have a table that we filled with prime numbers using an algorithm that we believed was the most efficient method for testing primality. Later we realized that this super efficient algorithm used by “famous” mathematicians actually only determined that number were likely prime. Uh, oh. Chances are that we were using some sort of special built prime detecting super-computer and didn’t generate nearly enough numbers to have a non-prime that was merely likely. That is still a question of chance rather than certainty.

So now if we were to modify the script from our previous post we would have to get rid of all of numbers and generate new ones. If we had implemented the algorithm for testing primality as above (a better algorithm than before which serves to highlight the advantages of modularity) as a function we can now use the improved algorithm to test those numbers already in the table. We can also use the more efficient approach of selecting the numbers where the prime test fails rather than iterating over all of them. If we wanted to put in new primes we can select from a recursive CTE those integers that pass the primality test rather than incrementing through each number.

WITH recurseMore AS (
SELECT max(p) AS p, 1 AS n FROM primes
UNION ALL 
SELECT p + 1, n + 1 FROM recurseMore
	WHERE n < 101
					)
SELECT p FROM recurseMore
WHERE dbo.intIsPrime(p) = 1

Using the recursive CTE (which is more efficient, but that’s a topic for a different post) further serves to highlight the value of modularity. Note that we are restricting the recursive portion of the CTE to n < 101. This is because a CTE can only recurse 100 levels (+1 for the anchor row). This limits us to selecting primes that are within 100 of our max prime. We can work around this by encapsulating the CTE in a stored procedure or table valued function that accepts the max prime as a parameter. Technically it would accept an integer and generate a list of primes less than or equal to that integer + 100. By calling this in a wrapper we can test the return of that function inserting new primes, and increasing the input parameter. If we get no primes for a group of 100 (again I am not a mathematician so don’t make fun if its impossible to go a hundred integers without a prime), we can just pass the next number above it and carry on.

Better Highway Safety Through Technology

Posted by thomaswmarshall

Preview the Crash Safety Drone today.

Generating Recursive Safe Prime Numbers Using T-SQL

Posted by thomaswmarshall

I know what you are thinking: What is a recursive prime number? Why would I want to generate a list of them?
Ok, never mind the second question, how do I generate recursive safe prime numbers? I presume I will learn to generate primes along the way.

The first question is easy to answer… sort of. The short answer is: A recursive safe prime number is a member of a set of prime numbers I discovered (invented/defined). As far as I know it is a set of prime numbers I was the first to define. Essentially it is a safe prime number (2p+1) such that the Sophie Germain number (p) is also a safe prime. In other words it is a prime number such that the middle number has a number on either side that are prime such that the middle number has a number on either that are prime such that recursive etc. (e.g. 2(2(2(2(2p+1)+1)+1)+1)+1 will be a recursive safe prime if p is a safe prime and the result is also prime).

The second question is not as easy to answer but I’ll try.

1. You want to show off what Tom showed you in hopes that you might share in that remote possibility that he’d win a Fields medal for it. The clock is ticking on my eligibility.

2. You want to improve your understanding of safe primes and Sophie Germain primes supportive of your understanding of strong primes in RSA standards and figure that T-SQL might be easier to read than C or any of the cryptic diagrams you find in cryptography texts.

3. You want to.

The third question is the reason I wrote this post. And yes we will generate primes. First we will need a place to put them. Since we are using T-SQL a table seems a reasonable enough place to put them.

CREATE TABLE [dbo].[primes](
	[n] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
	[p] [int] NOT NULL
)

Next we will need to generate some primes to put in our table. I use a simple counter with a trial division to test primality and insert those that don’t fail. I did some trial and error to get @n. It runs in under 10 minutes on my laptop. Your mileage may vary.

DECLARE @n INT; SET @n = POWER(2,20)-1;
DECLARE @m INT; SET @m = 1;
DECLARE @k INT;
WHILE @m <= @n
BEGIN 	 
SET @k = FLOOR(SQRT(@m)); 	
     WHILE @k > 1 
	BEGIN
	IF @m % @k = 0 GOTO NotPrime;	
	SET @k = @k - 1;
	END

PRINT @m
INSERT INTO [prime].[dbo].[primes]
           ([p])
     VALUES
           (@m)
NotPrime:
SET @m = @m + 1;
END

As of the writing of this post the largest prime in my table is 1048343. If we set @n in the above script to 1048344 we can run it again to get more prime numbers. Unfortunately, the largeR the numbers we test the larger their square roots and subsequently the greater the number of modulus tests performed in the inner loop. That means each successive prime we find will take longer than the last. Fortunately, finding the subset of primes we are looking for is a selection from a table.

WITH sophie_germain AS (
SELECT p.[n]
      ,p.[p] sophie_germain
	  ,sp.[p] safe
  FROM [prime].[dbo].[primes] p
INNER JOIN [prime].[dbo].[primes] sp
	ON (2 * p.[p]) + 1 = sp.[p]
	--AND p.[p] > 1
)
SELECT sg.n, sg.sophie_germain, sg.safe
, rs.safe recursive_safe
	FROM sophie_germain sg
		LEFT JOIN
		sophie_germain rs
		ON (2 * sg.safe + 1) = rs.safe
ORDER BY n

There would also be a subset of subsets implied such that they are recursive primes with a recursive prime for their Sophie Germain number. And another set would then be implied as is the nature of recursion. These I would call these double recursive primes, triple recursive primes, n-tuple recursive primes.

Thomas [W] Marshall

Database and Business Intelligence

#cherylsbirthday

Which came first? The Select or the Into?

Did you check the registry settings

Self Driving Manual

What if…

My Kickstarter is Launched

And Recursion(…)

Generating More Prime Numbers

Better Highway Safety Through Technology

Generating Recursive Safe Prime Numbers Using T-SQL