Transcript of the interview with Jonathan Lewis

If you missed the interview with Jonathan Lewis or don’t want to spend 93 minutes watching it, here is its transcript. Thanks to Stanislav Osekin from Innova, who did most of the hard work. I have only corrected special Oracle terms. Also thanks to Jonathan, who helped me in several complex cases where I couldn’t understand the wording.

T: Hello and welcome! My name is Timur Akhmadeev and I’m sitting here at the Innova office with Jonathan Lewis. Hello, Jonathan!

J: Morning, Timur! Nice to be here!

T: Nice to see you too! Thank you for coming!

J: My pleasure!

T: We are going to record an interview and I’m going to ask you some questions related to Oracle, to your work and your history. I would like to start with the questions about user groups. There are a lot of user groups in the world. UKOUG is one of the biggest, one of the (most) famous. How do you describe the main purpose of the user groups?

J: Well, as you say, the UK user group is very large, very well-known and I have been involved in the user group for, I think, nearly 20 years in various capacities. I was a director for a little while on the board and so forth. I’ve contributed a great deal to the user group. I think from my perspective the most important thing about the user group idea is that individuals, who use the software, who have to deal with the software, who do their job everyday with the software, get an opportunity to talk to other people. And often it’s the conversation when you actually see someone eye to eye, when you start talking about what you are doing and where your problems are. You can start a conversation where you can all learn perhaps how to do something better or find a way of actually solving the problem. Because you’ve got time to present an idea, have someone argue about it, make a counter suggestion, make a suggestion to improve things and so on. And so despite the fact that you’ve got all things like the OTN Database Forums, you’ve got other forums, you’ve got My Oracle Support (I’ve nearly said Metalink, you know), you’ve got all these things, you got all these bloggers who can supply you with the information you can read, you’ve got these places where you can post a question and get five, ten or twenty different people giving you bits of answers. But of course if you post a question it’s quite hard to have a dialogue with someone. And they can’t really ask you “Well, tell me a little bit more about this, this and this”, because it’s quite hard to type enough information to give them the right idea. So there is still plenty of scope for actually meeting someone because in the space of a five minute conversation you can exchange ten times as much information as you could by going online and typing something perhaps into a blog or typing a question in the Technet. So it’s the personal physical face-to-face communication which I think is a tremendous benefit for the user group. There are others, of course, but that’s the very first one, I think, that really, really matters.

T: Do you enjoy working and participating in the UKOUG and what is your view on your work in this user group?

J: I’ve always enjoyed the time I spend with the user group. Mostly what I’ve done with the user group is just show up for the main conferences. The UK user group particularly has perhaps as many as a hundred little events through the course of the year – a huge number of events! They are all broken down by special interests, so there’s a group which is basically database server group and they have three meetings per year and perhaps something between thirty and a hundred people, which show up for the each meeting. Then there will be a group, which is HEUG, Oracle’s Higher Education user group, and they might have two or three meetings, but perhaps there maybe ten people to show up. And they perhaps will show up in User Oracle office and they’ll just be sitting in one of the Oracle offices in London or Birmingham. We have a very large number of specialized subgroups which have a few meetings per year. And I tend to focus particularly on the group which is, perhaps, the database server. So, I did a lot of presentations. That was my, I think, biggest single contribution to the UK user group. I just loved going and standing up and saying “Look! This is what Oracle can do” and all explaining how to solve problems and things like this. And then of course it’s just quite fun having people to talk to afterwards and the general chatter that goes on at these events. And obviously I was also involved with the board of directors for some time. It’s now changed the structure a bit. I was on a council for some time. And, of course, there is also a lot of hard work there: paperwork, you know, paperwork and all organizational work is not really the thing that I do best. But we try very hard to balance the requirements of all the different communities that make up the UKOUG. So there is always a lot going on in terms of who needs what, how do you find out what people would really like to have, to see and so forth. Luckily, of course, because we’ve been going for so long and because we’ve built up such a large community, we actually have a dedicated office. The user group in the UK is a company. And we have an office which employs 20-25 people who are full-time employees of the company. And they spend all their time preparing events for us, organizing events, finding the places to go and doing all the work which makes the conference possible. I mean, we have our own specialists and then it’s a company and this is absolutely brilliant: they do a huge amount of work to make the whole thing work.

T: Do you think that the whole user group thing influence Oracle or any development strategies?

J: I think they can certainly have an influence. One of the things that the UK user group did each year was carry out a survey through all its members, and the membership numbers were, I think, 2500-3000 in terms of large companies and maybe 7000 or 8000 in terms of individuals. So, there is a large body of people that we can actually survey. And in the survey (it’s quite a long survey, quite a lot detailed questions and so forth) we can summarize this information and we do present it back to Oracle in the UK. And there have clearly been occasions when Oracle has been able to take the bunch of the information we’ve given them and they’ve been interested and followed some of the directions that we suggested would be helpful to our users. So, I think, because of our size, because we bring so many people together, we can act as a voice to Oracle and Oracle will listen. I suspect, they will never listen as much as we would like them to listen, and they might think they are paying far too much attention to us sometimes, but there is certainly a channel between the user group and Oracle and constructive results do come out from our channel. If a group wasn’t there, I don’t think individuals or even large companies would be able to get quite so much communication going between themselves and Oracle. So, I think, that’s another important benefit of the user group. Once it’s got enough bulk, once there are enough people, and you can turn to Oracle and say “Look. We’ve asked a thousand people and 80% of them said, they really wanted to see something like this. That’s a lot of people and that’s enough people to make Oracle think “Ok, let’s investigate that” or “Let’s try and address something in that area”.

T: That’s really great. And I think it could be quite good if the user group could share their experience of what they did, I mean, maybe the server, they get some feedback from users and they provide it to Oracle. And maybe in five years time or seven years time something is changed. I think it will be great to publish the whole results of the achievement of the user group to make other user groups continue this work.

J: Or even give them an incentive. Or let them see that something does happen. I think we probably do let people know, but I’m not sure. I mean, that’s a very good thought and maybe we don’t make it visible enough that we have this important voice from the end users. And, obviously, once we got this survey thing, we do publish summary information on the user group website about it. So, this was the broad picture we got from our survey. This is the message we passed to Oracle. But I don’t think… I’ll have to check this. I really have to check this. I don’t think we gonna have something which says, you know “a year and a half later, we sent this to Oracle, and if you will notice, Oracle has done this and this”. I think we go back and say to the users “we do make a difference”, but I don’t think we actually make it very very clear. I don’t think we document and enlist the things where we can say “well, because you told us you wanted this, look, Oracle has done it”. Perhaps, we have to make it much more obvious.

T: It sounds perfect.

J: Yeah. It helps not just us to encourage our membership, but it would be a beacon, a sign to other user groups as well.

T: Yes, and it could involve more people to the UK user group, as it’s such a strong group. That’s a very nice overview of the user groups, thank you. Continue to the other type of the group which has been formed by some people from the UK. I mean, OakTable Network. It’s a special kind of group. Could you tell us more about it?

J: Right, the OakTable network… It’s quite difficult to say what the OakTable Network is. In a way, it’s a bit like a club. We do technically have… I think it’s a trademark or it might be a copyright mark and we have some sort of a legal claim to the name “OakTable Network”, but really it’s just a, sort of, group of like-minded people and it started 10-12 years ago, I think. Mogens Nørgaard, who comes from Denmark, he is probably the focal point, if you like, of the OakTable, because he arranged a little conference in Denmark, just two or three days, I think. In fact, that might have been a conference, where I was the only speaker, that it was perhaps my first time I did my Oracle seminar, my internal seminar in Denmark. And the thing was, he said I’m gonna do this, and he started sending me mails like, well “Cary Millsap is coming over from America” and “Anjo Kolk is coming over from the Netherlands” and “Steve Adams is coming over all the way from Australia” and, of course, I was getting more and more panicked, because every famous name, every competent expert in the Oracle world seemed to be travelling to Denmark and I was supposed to stand up in front of them. We all are just enthusiasts, we all really enjoy what we are doing, and we are interested in each others’ problems and questions and so forth. It got us to a sort of stage where we were sitting around a kitchen table in Mogens’ room, and it was a huge oak table, something like more than 2 meters long, really-really big thing, you could sit 10-12 people around it very easily. So, we were sitting around it and typing in our laptops and saying “This is what happened”, and someone was trying to say: “I am trying to solve this problem” and another guy said “Let me just work on it, because I may know the answer” and it was this conversation going. And we stopped about 4 o’clock in the morning, you know. I was supposed to get up at 7 o’clock in the morning and start speaking at 9 o’clock in the morning, but at one point it was along the lines of “OK, pizza has just arrived” or “Dinner has just arrived” so we have shut the lids of the laptops, put the plates on top of the laptops, started eating, finished the dinner, open the laptops … completely mad, and it was when we were 10 or 12 years younger. But it was just a very enjoyable meeting of lots of similar-minded people and basically one of us said: “We should do this again, we should form ourselves into a club, we should make sure we do this again. We are going to be the OakTable Network”. And that was it! It started off with about 10 people, I think. But gradually over the time, individuals in the group said, you know, “I’ve been working with this person; or two or three of us has a history of working with this individual, shall we just invite them to be a member”. It’s terribly undemocratic, it’s terribly unfair, there are no doubt lots of people who could say “Well, I’m good enough, I deserve to be a member”. Well, it’s really just that people meet people and they chat and they exchange e-mail and it’s just a club for friends who happened to be very very keen in Oracle and occasionally we would just say: “Hey, let’s just invite him as well. There is nothing terribly formal about it. People tend to recognize that if you are actually a member of the OakTable, you must be pretty good at Oracle at some stage. And, of course, you are a member of the OakTable, aren’t you?

T: Yes, thanks to you!

J: Yes.

T: What do you think will be the future of this group of people, of this club?

J: I think the club will just carry on. Anyway, it’s just a private mailing list, most of the time it’s just a private mailing list. But of course there are loads of conferences all over the world, OpenWorld, for example. I can’t think how many members of the OakTable went to the OpenWorld and we had a mini OakTable Conference in OpenWorld and RMOUG, the Rocky Mountain user group, quite a lot of OakTable users go to that one, and again that’s the sort of… not quite a mini-conference, but there’s certainly lots of them get together and they set up a “big dinner” type of thing. It’s really just a friendly club, an e-mail chat room almost, just for people who are fanatic and interested in Oracle. I think that’s pretty much, that’s all it would be. I don’t think we’ll ever try and organize a takeover of a user group, organize a takeover of the Oracle corporation. It’s just a way of keeping in touch with people who are very similar-minded. The group will probably grow a little bit, but not terribly dramatic. It’s just fun.

T: Indeed. About your public activity. Some years ago, I think it was 7 years ago, you published your notes and your findings, and your script on a special website and it was something like a living document or so. You continued to change it over time. And then you went to the blog, I think it was 7 years ago.

J: About 7 years ago. Seems a long time.

T: Absolutely. Do you think it was the right idea?

J: I think so. The original website… I don’t think it was a living document. Well, occasionally I corrected mistakes that I found. And I did try and set up a Frequently Asked Questions thing, where people could post, could e-mail me a question which they needed to answer and I put it on the list and allowed anyone to write in a standardized format a response to that question. So, that was quite dynamic. The trouble with that thing was that I really didn’t get many people contributing articles, there were a few people who sent in several, but didn’t get much contribution, much feedback on that. And then I thought eventually this blog thing, dialogue is important, feedback is important, people contributing more. If you get five people coming together, the combination of what they say can often be much more important than what anyone individual can say; they can fill in the little gaps that other people leave. So, I thought the blog will be a good idea. I write something and perhaps this would prompt a few other people to say “Here is this extra little bit of information that fits” or “You haven’t asked or answered this particular detail, have you got some further comments that would address this particular part of it”. It never actually turned into that quite sort of cumulative thing. In a way, the blog is more the living document thing than a website that it was, but it never really turned into that. And in fact I have to say that one of the problems I have when people do write comments and ask questions. This particularly has come up with the addendum, the errors pages for my books. When I do ask questions, it’s very hard to find time to work out good answers to those questions, so even if in principle I like the idea of when I publish something in a blog and then we can talk about it afterwards – it’s actually very-very difficult to do the talk about it afterwards. I’ve never quite made it work. The principle is good, but it’s very hard keeping up with it. I think it is a better idea than the old website was.

T: Absolutely. And it appears that people just discuss some things in the comments, which is great.

J: Yes, from time to time when I disappear somewhere and don’t have time to look in my own blog and when I come back a week later and there are seven or eight articles going back and forth between other people saying “Yes, Jonathan, how did you so and so?” And I haven’t seen that question, but someone else’s answered. To me, that works perfectly. And, of course, I do still have a sort of overall view which says if something is actually right or something that I know is actually wrong, I can make sure it doesn’t stay there. And that’s of course one of the problems that by comparison the things like OTN and the database forums… You often see people coming there with a question and some of the responses to that question are actually very-very bad answers. And of course if someone is asking the question, they cannot see the difference between a good answer and a bad answer. And quite often you see thing like someone gives a gold star to someone and says “Great, that was really helpful, thank you very much” and I think “No, you definitely don’t want to do that!” There’s no proper mechanism for eliminating the bad stuff and reinforcing the good stuff. If you’re going to have a completely open arena for dialogue, you need a mechanism which eliminates the stuff that’s going to be unhelpful and highlights the stuff that could be helpful. And of course the problem is there is so much stuff, the sheer volume of what you need to know is so great that it really is impossible for someone to be the moderator for it, even on the OTN database forum. No one knows enough to be able to go through and clean up and tidy all these things. And the software, just as an example, the software that Oracle has currently got doesn’t allow other people to read these threads and say “No, that is a bad answer”. You actually have to contribute to this thread, you have to know something, which is: here is a question, here are three bad answers, here is one good answer, here is a comment which says: the first bad answer is a bad answer, then there’s a good answer, another bad answer, then there’s another comment which says “The second bad answer is bad” and anyone who is reading that stuff probably stops after two or three entries, because they think they’ve got an answer and they never get down to the bit which says “Well the first thing read was not the right answer”, so this information is poor information, it’s bad information, just keeps reinforcing itself, it just keeps existing, when really you’d like the bad stuff to die.

T: But it’s impossible.

J: Yes, it’s impossible. The Internet saves everything. And it’s too big job for any individual to be responsible for going and cleaning up the mess. At least on my blog, if someone puts something that is complete rubbish, I can go back and I might actually temporarily put in a reply which says “No, that’s not the right answer, because XYZ” and then, perhaps, six months later, when people will be able to examine the wrong answer and explanation why it is a wrong answer, I’ll just delete it, so that the wrong suggestion doesn’t persist. I delete the wrong answer and my response to the wrong answer. Luckily, that doesn’t happen very often, in fact. But at least the environment makes it possible for that to work.

T: I think it works great.

J: Thank you.

T: Comparing blogging and commenting on forums with something which is now considered old stuff, like old-fashioned magazines, Oracle publishes its magazine, Oracle magazine [at the time of the interview, May 2013]. UK Oracle User Group publishes magazine which is called Oracle Scene. And there are a couple of other magazines, like Northern California Oracle User Group magazine, which is quite good and available to all people around the world. Do you think it’s still relevant to publish a magazine?

J: I think that many people still like having a physical object in their hands to read, so there are quite a lot of people who will be happy to have the magazine. I think possibly though the more important feature of the magazine is that it reminds the users, the user group, that there are experts in the user group who can be helpful. That reminds people that it’s possible to have a dialogue, that it’s possible to come together in a group. If you look at Oracle’s magazine, for example, or if you look at any of those magazines, you’ve mentioned the Northern California User Group magazine, you’ve mentioned the Oracle Scene, if you look at any of them – it’s a small number of pages, it’s thirty-two pages or sixty pages. And it comes out, well, Oracle Scene, I think, 3 times a year at present, the Oracle magazine itself, I think, 4 times in a year. It’s a tiny amount of information comparing to the enormous amount you need to know. And it’s a little bit here, a little bit there. And the Oracle Scene has a section… We try to spread and cover the field, but we have a section which is database technology, we have a section which is particularly E-Business Suite, we have a section which is Developers, and if you think of all these things that Oracle has got as a product, you’ll end up it’s not quite as bad: one page is for this group of people, one page is for that group of people. You get a small number of articles, and it’s a very thin covering. So, no one gets a large amount of helpful information from the magazine, but what they do get is the sense which says “There are people out there who have this information and the user group can pull us together”. And it reminds people that it’s worth going to the user group’s meetings, it reminds people it’s worth going to a conference, where they can meet other people, and they can spend more time and learn much more by having conversations. It’s a good reminder for people, and in a user group perspective, it’s a good reminder that there is a user group, there are people who can help.

T: That’s great. Moving on to some technical questions and some interesting stuff: you are considered one of the major experts, most well-known experts in the optimizer and your book “Cost-based Oracle fundamentals” is a fascinating book, which covers some specifics of the optimizer. What do you think about the future development of this piece of software, I mean, Optimizer? What is the future of the Optimizer? How will it be developed?

J: Well, really the question is, what do I think Oracle ought to do rather than what I think Oracle is going to do.

T: Probably, yes.

J: That’s a very tricky question. There are days when I love the Oracle Optimizer and I think the optimization that Oracle does, the algorithms, it’s got the ideas built into it, the transformation capability… I think it’s absolutely fantastic. It’s brilliant! And then there are days I hate the Optimizer because of the silly little mistakes it makes, and the things that I can’t change when I know they could be changed. And what I’d really like Oracle to do is stop getting too clever. I think they are trying to do very sophisticated, very subtle things and there are a lot of little details left behind which could be corrected, could be enhanced and could be improved, but because there is a massive push, it seems, for the new features, new transformations, new sophisticated logic to the Optimizer, a lot of little details get left behind, they don’t get fixed. And one of the problems that follows on from the little details don’t get fixed is when you get to doing some sophisticated transformation, you find that if Oracle does the transformation A, some little features of the basic arithmetic say, well, the number of rows is going to be a hundred, but if you do a transformation B, some odd quirks in some other part of the code says: “Because you do it this way, the number of the rows is going to be five thousand”. And of course you get this massive capability of Oracle switching catastrophically between two different plans because of not the sophisticated bits, but because of the very simple bits, way way back in a processing, where two things that should say the same thing say different things. Now, in the talk I gave yesterday, I showed you that example which said “Hey, this is a query. All I have got is a single table and a 2-column index” and when I run the query “select where column 1 equals and column 2 equals”, here, in three consecutive versions of Oracle, are three completely different calculations as to how many rows you get and how much work you have to do. How much simpler can you get than “select from a table by index = constant”? And yet it wasn’t till 11.1 that Oracle came up with something which said: “Oh, I do actually have enough information and I have a very good estimate of both cardinality and the cost of that query”. Three phases: 10.1,, I think it was. Three completely different calculations. Now there are other oddities. Whenever you look very-very closely (because I’m solving the problems of people, I’m trying to find out why things go wrong and I often look very closely at some of the bits of the arithmetics that the Optimizer does), and it’s astonishing that whenever I look really closely at how some of these details work, I end up thinking “Well, that’s strange. This part here is not consistent with that part there” or “We’ve lost a little bit of information”. And some of these things are incredibly silly. You find little bugs where you say “Well, Oracle has obviously created an array which lists all the indexes which are relevant, but for some reason it’s ignoring the first index”. And you think about it for a bit, you rename the indexes or create a new one or drop the first one and after a while you think “Well, it looks as if what’s happened is that some program has created this array and it started counting from zero when we created it, but it started counting from one when we were using it”. And you always lose the first index! Really strange things like this can happen. Or worse still: you can get something which says: Well, we’ve got 8 indexes on this particular table here and I’m going to do a bitmap conversion of these indexes, so I work out the cost of using the first one – that one seems good, and I work out the cost of using the second one as well – and that seems good, and so does the third and the fourth one… And you look at the arithmetic and it’s not used the cost of the eight consecutive indexes, it used the cost of the first index eight times! I forget which version it was and whether it was fixed, I have to go and check that one, but you see these things, you see that someone’s forgotten to increment a counter! Incredibly simple things like that which lead to odd inconsistencies. And you can look back at this and think, well, how can something which says “should I unnest this subquery and do complex view merging at this point”, how can that possibly make the right decisions, if it can’t even work out what the cost is and how many rows it is going to get when it’s just accessing the data to start with? So I’d like to see someone going back and looking into a lot more, the details to make sure that first these things are fixed and also that if someone says what we’ve got this transformation A or transformation B on this query, why do we not get the same costing at the end of these transformations and the same estimate of cardinality? Where did the inconsistency arise? And then it might highlight some strategy for getting better results. I would like to see that. I mean, having said that, Mohamed Zait, I saw him quite recently in Switzerland with Christian Antognini, he did a presentation and one of the wonderful things he showed was how Oracle had in a very recent version gone back and reengineered star transformations. And you got a real sense from that of how incredibly difficult it is to change the Optimizer, because there are so many interwoven links and all the different little components, and by the time, I think he said it took something like two and a half years to re-engineer this, because by the time he’d finished, his designing of programs had finished, the star transformation is now something which fitted in, if I would say, orthogonally, independently of other transformations. So, you could do A, B, C and star transform or you could star transform and do A, B, C. You could do some of this business of table elimination or one of these table partition tricks, in a middle of a star transformation, or you could do a star transformation and table eliminate, so you could fit it in when it became appropriate to fit it in, whereas previously you had strange things like… If you put a subquery factoring clause (WITH subquery) and what you put in that is something that could be star-transformed, Oracle couldn’t do a star transformation, because it was inside WITH subquery and you see simple examples like this, people say “Why didn’t Oracle do XYZ?”, and the answer is “Well, because you’re doing A, XYZ is not legal”. In theory it ought to be legal, but it just happens that it’s not.

T: Or not possible yet.

J: Exactly. And this was the thing with star transformations. If it’s a WITH subquery, you cannot star transform. There is no reason why not, except you can’t. And then two-and-a-half years of engineering and you can. So it’s immensely complex to go back and make these little changes. And, of course, when you make changes, it’s a cardinality estimate changes or cost estimates changes, right? Someone’s gonna sit down and say: “Why did my query plan change? It used to be good and now it’s bad!”. Anything you do, anything that changes the Optimizer will make somebody unhappy. And the people who are unhappy will make a loud noise, the people who think it’s great and massively improved, are not going to say a thing. They are not gonna stand up and say “Oracle! Fantastic! Wonderful! Thank you very much!”. The only people who say things are the people who complain. So, Oracle Corporation cannot win on optimization. Whatever they do, they’re in the wrong. But it would be nice, I would like to see some of these funny little details cleaned up and the simple stuff, most people end up with simple stuff, I think. I think there could be some improve on that.

T: So, your thought on this is that the Optimizer will not be redesigned.

J: No.

T: Well, at one point they did redesign it and they introduced the Optimizer, Cost Based Optimizer, so you don’t think at some point in the future they will need some…

J: Start it all over again? Yeah, well, I can’t imagine a complete redesign from the ground up of the whole Optimizer and I believe there were somewhere around version 7, a big push in Oracle to do a massive redesign of some parts of Oracle, which, of course, took a lot of time, took a lot of money. I can’t really imagine any company saying: well, we’ve got this product, it works fantastically well a huge fraction of the time, but we’re going to start all over again. And we’re gonna spend a huge amount of time and a huge amount of money not making any difference to all our customers. That just doesn’t happen. I think that’s extraordinary that they spent so much time working on a star transformation bit. And I think we might see various pieces being reengineered, but I can’t quite imagine anyone from Oracle Corporation saying: we will now fund a complete new Cost-Based Optimizer. It can do so many things that it would be very hard perhaps even to write that in a specification of what it’s got to do and roughly how it’s got to do it. Incremental changes, little bit of evolution, a little bit of runaway growth, occasional bits of rationalization, but not a complete reengineering. I would be very surprised.

T: Evolution, not revolution.

J: Absolutely right, yes.

T: That’s reasonable. One of the options of Oracle is the Real Application Clusters. That’s definitely not a cheap option and lately it has become so common, especially with the introduction of the Engineered Systems, and it became very popular. So, what is your view on this option and what do you think, in the future all Oracle installation will be RAC, for example, or will we still see some single-instance databases?

J: That’s a very difficult one. I think it comes down to what Oracle would like you to buy or what rather they would like to sell to you, which is not necessarily the same thing. In a way, I think, RAC has outlived its usefulness. And there was a time when something like RAC was necessary for scalability. Oracle also used to sell it to you in terms of high availability, but some of the sites I saw it was only highly available if you got a massive team of really competent experts who day and night watched it carefully to make sure it didn’t fall over: it was too fragile, there were too many dependent little pieces which could possibly go wrong. And I think, perhaps, in part, this is why Oracle allowed RAC to go down into a Standard Edition, into a very cheap version of RAC. Because if you got fifty customers using RAC, it takes a long time to find all the bits which are broken. If you can push out a cheap version of RAC, and you’ve got ten thousands customers using RAC, you’ll get lots of profit, all your field testing goes on your cheap customers, you can figure out what you’ve got to do better. So I think there was certainly a push to move out RAC as much as possible, because it helped to develop the product, making it more stable and easier to handle and so forth. But I think – well, this is a speculation – I think with the Exadata construct, Exadata is RAC in a box, ODA is RAC in a box. Well, probably, Grid in a box rather than RAC in a box, or “cloud” in a box, I suppose, is the right term these days. The Exadata is RAC in a box, it’s using the technology which is now being field-tested, but it’s hiding the complexity by saying to the buyers “Just buy the black box”. You don’t really need to know that it’s RAC underneath. And if you get Platinum Support, you don’t even have to worry about fixes and upgrades, because Oracle will just log on to your machine and patch the whole thing, from top to bottom, just like that. And that’s the sort of marketing strategy: make Oracle a black box. The RAC-type stuff is just a part of that black box. And I think I can see the day when Oracle says: if you buying Oracle, what you’re buying is a black box that runs Oracle. Oracle is everything: it’s the whole stack from the database software through the network software down to the cluster software, down to disks, the whole thing is just Oracle, you do not need to know that it’s RAC, that it’s the high-speed network, that it’s this layer of disks. So RAC was a step towards the complete black box solution, I think that’s it. And given enough time – but it won’t be a great deal of time – I think you’d see Oracle saying “Well, we’d prefer if people did not buy a couple of big IBM boxes and link them together into a RAC system”. The only RAC systems should be a RAC system which is inside an Exadata on Oracle Appliance, I think. And if you want anything else, you can buy a single-instance Oracle and run it on a big machine and that’s it. RAC, perhaps, wasn’t seen this way in advance, but I think you can look backwards and say: “Well, RAC was important step towards Exadata, towards ODA, towards the whole cloud thing”.

T: And considering one of the chapters of your latest book. It has RAC in its name… Well… “RAC and Ruin”.

J: Well, RAC and ruin, at least to the English area, I think, it has a sort of poetic flow, it’s for the sound, it’s just a phrase that comes nicely to mind. I did explain carefully in a book that this was for alliterative effect: RAC and ruin. I was not claiming that if you got RAC, you are bound to ruin everything, OK? That wasn’t intent. It just flowed nicely as the title. And although I did actually follow up the fact that it is relatively easy to ruin things if you don’t think a little bit of how RAC is different from the single instances. And of course if you do this business, if you have got multiple instances, multiple memory images of data and lock states and all the rest of it, you’ve got to get your machines to communicate and essentially the idea with RAC then is that you minimize the communication, because you don’t want to waste time talking about what you want to do, you just want to do it. So, there was a little bit of ruin in the book, but the title is really for the sound of the title, not for the intent to mean it.

T: Yes, I understand that. And considering your clients and your consulting work, do you suggest RAC to your clients often?

J: Um… Very rarely. Hm. And this is… this is not because I have a bias against RAC.. um.. I must have said in a sense RAC it has almost had its day. I don’t think um… RAC is a strategic thing. I think if I were to say to someone: “It looks like you need a RAC solution”, I’d probably end up saying: “It looks like you need an Exadata solution”. But essentially RAC came out as a mechanism for saying: “we can’t get a machine which is big enough to do all the work we need to do, so let us tie several machines together”. Well, nowadays you have to have a horrendous, complicated, enormous workload before you need more than one machine to do it. Do something terribly simple, you know, single machine, single instance, you can do so much, meaning, the machines nowadays are just so powerful. I mean network cards you can plug in pump out stuff so fast, the SANs, if configured correctly, and if you really, really busy you.. you have your own dedicated SAN, well, the disks you can connect to can cope with so much information that you don’t need to double up machines in order to get the performance and throughput. And when it comes to high availability, 15 years ago, I think, a good high availability solution which has only gotten better, coming from your operating system, you had things which would say if this machine, machine #1, fails, we can just switch a logical machine over to the machine #2 and it’ll be up and running within minutes. Right. It gets faster and faster, of course, so just simply something which says “fail over at the operating system level and restart” – that’ the sort of resilience which is almost invariably totally adequate for most customers. Of course we don’t have things like this on virtual machines system anyway where… ahm… VMware… where where I forgot how they call it…

T: vMotion?

J: vMotion, that’s it, yes. With that type of technology you can even say, well, the machine I’m running on, my instance now at the moment, isn’t actually big enough let’s buy a new and bigger machine and oh yes next Saturday without stopping anything we’ll just slide our virtual machine from our little box into my big box. It is ridiculous what you can do with hardware these days, with the layers of software and the management stuff that can sit underneath it. So RAC … I don’t really see any great point in suggesting to someone that they need to buy RAC. I’ve never come across a client not for years where I’ve said the only sensible solution is RAC. Ok. And RAC does require some complex management, complex configuration. It is still, if you’re doing it at a “home-grown” type thing, it is still a bit fragile on the edges.

T: Yeah.

J: There are people, there are people who are very good with RAC, I mind you. I can name you half a dozen Oak|Table members who are fantastic with RAC. But for the ordinary business people it’s fragile around the edges.

T: Yeah, great. And continuing.. continuing to other development software and the upcoming next Oracle Database 12c announced on Oracle Open World, the previous, almost a year ago. And since then a lot of people started to talk about new features. And what… you heard about them… some of them

J: Yes

T: What is your most favourite?

J: New features in Oracle 12c, well, my my sources of information basically of course that I can mention are the presentations I went to at Open World, as you say last year, and, of course, when you see those presentations, the first thing Oracle puts up, the first slide is “There is no guarantee we’re going to do this. This is what we would quite like to do and what we quite might actually do.” But there is absolutely no guarantee and they call it a “safe harbor” statement. So you can see all this stuff and you don’t absolutely know if it is going to appear. The thing I think that struck me most significantly was that there are a number of enhancements with partitioning which allow you to use them in a more natural way. That let you do things which you would have thought you could perhaps have done 10 years ago, so details to do with referential integrity and how things behave you want to drop old partitions and then things happen, things cascade normally, you don’t have problems with maintaining referential integrity if you declare different partitions at level. Interestingly there’s even one little detail with partitioned tables: you can rebuild a table online. Provided what you’re actually rebuilding is a partition of a partitioned table. So you can actually say things like, “Every month I will make a partition redundancy”, so 12 months after the day this appeared I will make this partition redundant. You can actually say things like (I might get syntax wrong) “alter table move this partition into a different table space with compression as high as I would like to do it online”. So application can still coming quite happily and that data will move and indexes will be rebuilt online. And that’s a nice looking touch, an important little touch. So that’s one of the nice things, I think there are features in partitioning which will be very helpful.
The other thing, there have been changes in the way the Optimize works, particularly in terms of some of the stats collection, and especially histograms. Now, I know the histograms, I got to pay close attention to the presentation on histograms on the Open World because I was the one giving it! But histograms in 12c are much more efficient at collection and the quality of the information you get in a histogram is much better. Generally speaking I always want people to be very very cautious about creating histograms, and only create them if they really are confident that they need them and they’ll use them effectively. But it is a nice feature of 12c that the histogram definition, the strategy, the collection costs and all the rest of it: everything about histograms is much nicer in 12c. So, those are the two key things that struck me about 12c as very worthwhile improvements. Loads of other bits and pieces, bits and pieces around the edge. I mean one of the things you’ll hear about 12c of course most commonly is: Pluggable database. But it doesn’t get me excited at all.

T: Really?

J: So you’ve got pluggable databases. SQL Server had this 20 years ago or something! I exaggerate, but you know, it’s ok, so, I can appreciate that there is a benefit and most particularly there is a benefit for Oracle corporation of having pluggable databases because right now if you run up on Oracle 11 RAC with a few bits and pieces, features added on like Standby database and you look at the number of processes running on your Unix boxes there is something like 45 or 50. Well, an instance to run needs 50 processes. Well, if you’ve got 12 instances, 12 databases and you want to consolidate onto one machine. Well, you can’t actually just put 12 databases and 12 instances on one machine, because you’d have around 600 processes running on that machine before even you get your end users connected! And it has to be a very very big machine to handle even idle processes when there are 600 processes. So Oracle would of course like to put it all in one database and only run 50 processes so your sort of background process overhead, that you simple operating system housekeeping – the background processes – is much less significant. Well, actually taking an application and physically, say, well let’s take it a schema and put that schema into a database is very hard work. But if you can say, well, let’s create a container database which has 50 processes and then you can move a pluggable database in, it’s easy. In principle nothing changes. The plugged database doesn’t know it’s inside a container. It just happens that the container is the only thing that needs to run processes. So you can see from Oracle’s perspective, it’s easier for them to say, you can do consolidation with Oracle because you just put all your databases into one and you’re actually only running one database which is holding all these pluggable databases. And if it’s Oracle, it’s a conversion, say, consolidation convenience for the customer, but people who really benefits it are Oracle because it lets them sell Oracle more easily.

T: Don’t you think this is, I mean, container databases, the change of the architecture is considerable?

J: I suspect, that they don’t have to change the architecture very much. I don’t think it’s a wholly engineering job. I think there are probably a few key points where they have to make very careful changes. One of the questions I asked one of the architects, I think it was a designer when I was at Open World, I had a chat on one of the stands was, the first thing that cross my mind is this: if you have 6 copies of E-Business Suite running on 6 separate databases and you now put them to 1 database, pick a very common SQL statement. There are now 6 times as many copies of that SQL statement running as there used to be. Or rather there are 6 times as many executions of that statement. Isn’t that going to introduce higher pressure on just a few mutexes, I would have said latches just a couple of years ago, just a few mutexes because more people are running that one statement concurrently. And the basic answer to that question was well, at present our hashing algorithms are based on the SQL text. If you go to pluggable databases your hashing algorithms are based on the SQL text and the container ID. So two statements which are the same but from different plugged databases actually hash to different locations. So you don’t end up with hotspots becoming N-times as hot. So, you could imagine that there are a few critical places where Oracle says we have to bring in this one extra value, this container ID, for each of the plugged databases. But otherwise everything else is pretty much the same as it used to be. You could imagine that 😉 Oh, obviously it’s one of these sayings where someone says: “Ok, we’ve got 10 millions lines of code we only need to change 5 concepts. So it’s easy: there are only 5 things we need to change. Now can we find all the currencies of those 5 things?”And you can almost guarantee that one day someone will come along and say: If you do this and this and this and you are using this feature and you got such and such going on at the same time, you missed one. So, huge amount of work to make the change, but conceptually it could be relatively easy. Conceptually. Physically – hard work, but conceptually straightforward.

T: Have you heard about the threaded database?

J: Did I hear about?…

T: Threaded database. They did something with the way they manage processes.

J: Alright, yes, yeah.

T: And there was an article on some site which talks about the way Oracle will use processes in the 12c.

J: Right.

T: And it seem like they will do some curious stuff which is similar to shared server but dedicated (sic!)

J: I don’t think I’ve read that. One of the other thoughts that crossed my mind is that the single hottest point in any Oracle database ends up with is being the log writer. The rate at which the database can change is dependent on the log writer. Well, if you’ve got 6 busy databases with 6 busy log writers and you put them all into one container, if the container only has one log writer process, it will not keep up with 6 lots of log being generated. Therefore Oracle must have done something about making, I should say threading, some sort of threading with the log writer to, say, somehow even if we only have one log writer it has to be multiprocessing. Somehow the log writer has to work a lot faster, so perhaps one of the things we may see and perhaps the most critical and complicated bit that Oracle has to do is work out how to handle the log writer. And yes, I think I’ve seen something about, I think the expression was “threaded redo”. I can’t remember where I read it, but there was threaded redo. And something very different about what goes on with the log writer. And yes, there may be interesting things happening there. And certainly if I wanted to run up some tests, let’s say how easily I can break-it type testing, and that’s my speciality, if you want me to break something I’ll find a way of breaking it. Before it goes production. Always break it as soon as you can! Know what’s going to break in production by trying to break it first in test. It would be, I think I’d try and put pressure on the log writer to run as many databases plugged as possible which would all try to generate lots of log just to find out exactly what Oracle did in terms of handling excessive amounts of log if it’s still have got only one log writer process.

T: A question.

J: Yes.

T: How much is a “much”? You said “excessive amount” redo

J: What’s an excessive amount of redo. Yeah, oh yeah. I often point out that if someone says “if this number is large…”, large doesn’t mean anything. It doesn’t tell you anything. And the problem of course is that sometimes it’s just not possible to put an absolute number in place. You know, if you’ve got 2 CPUs and 4 Gb of memory, something you think is very big, is completely ignorable for someone else’s concern if they’ve got 168 CPUs and 320 GB of memory. So “large’ always turns into some sort of comparison between the amount of work you think you have to do and the amount of hardware you’ve got to do it with, so, you have to watch out for that one. But if you think about the log writer, the log writer basically says something like, every time you commit or every megabyte – flush – I will actually write to disk and write continuously to disk at the end of a file, occasionally switch log files to keep on going I’m just just streaming out continuously to disk. Well, if you said something like, how fast can you actually do a DD? How fast can you literally copy physically a file from one location to a different disk. What’s the fastest you can write to disk? And you might say something like, well, it’s a continuous stream, 300 megabytes per second? Maybe? So once you’ve got some SSD sitting there in the middle to help out a little bit to accumulate lots of little writes into big writes and then pump out the big writes down onto the physical disk. A figure like that.
Well, you know, if I have an application which generates 40 megabytes per second when it’s just normally running, bear in mind when I say that – the log writer – people often say we will duplex, we’ve got to make 2 copies. So that 40 megabytes of writing is actually 2 loads of 20, you know, things like that come into play. But if I’ve got something generating 40 megabytes of actual log per second, and then I say, and now I’ve got 10 copies of this application – there you go, 400 megabytes per second. Can I get that down to disk? That’s big.

T: Absolutely.

J: Big depends on what you’ve actually got to use, but you can see there are simple mechanical limits at some stage there are for it today, and you can say, if this number goes beyond X that’s what my disk can do. That’s it. That’s the fastest I can change the database.

T: I’ve seen some AWR reports on the Twitter. Which told me (sic!) some incredible numbers about redo generation. Like 600 megabytes per second or 1 gigabyte per second. This is only simple 2 socket boxes. So on 2 socket server. On a simple service which is also a part of the Exadata. But with some flash-based storage.

J: Yeah, right, that’s sort of where you’ve got to have some sort of… I mean, you don’t have disks, you have flash, you have some sort of solid state mechanism behind it. Trouble with that is that you can go on for a certain amount of time. And at some stage, and particularly if you’re archiving, archive log mode, you’ve got to de-stage from the solid state to the physically cheaper medium like spinning disks and tapes even maybe. You get your high speed from solid state, but that’s quite expensive and if you keep on doing it you’ll fill up your solid state. Eventually, you’ll run out of solid state, and this is an interesting thing about benchmarks, you engineer your benchmarks very carefully to go as fast as possible for as long as it is necessary. One of the old jokes about some of the Oracle’s benchmarks, the TPC-C things, as the benchmark says you’ve got to do this for this amount of time. You look on some of the ways in which Oracle has managed to go really really quickly and you think, yeah, that works. But if you have to carry on for another 2,5 hours it would stop. You know, you look at some of these and you think, in real life you would not implement this way because it’s not actually scalable it’s not a growing solution, it’s a fixed size solution. Some of the things Oracle does with single table hash clusters are just a little bit naughty. You know, fastest way to get into single row data is if it’s perfectly positioned in a single table hash cluster. Single table hash clusters have to be pre-declared, they are of fixed size, and if you get it wrong and you start overflowing things slow down dramatically. So…

T: You have to be clever

J: You have to be very-very well-informed. And it’s not a continuously growing solution. It’s very often benchmark special. Everyone does it.

T: But not in real life.

J: Not in real life, not in real life, no. I hope not.

T: Apart from the database Oracle is a big company business and the latest trend in this business is new word BigData and “the cloud”. What do you think about this new stuff which comes up as NoSQL databases and some applications which Oracle starts selling. Like big… like Exalytics. Or something like it.

J: I haven’t looked into this sort of area closely enough. I’d go to a few presentations at conferences and things like these. My feeling about the whole big data thing… I think it was Gwen Shapira who summed it up quite nicely. When she explained or put across the concept that you can often get very interesting information if you do something very simple to a very large amount of data. And you can get better information, better understanding than if you do something very complex to a very small amount of data. And this is, I don’t know how well it comes in big data concept, but it seemed a very sensible observation. And my feeling about the BigData business is that it doesn’t matter if the data is wrong. Or rather it’s not quite right. If you’ve got a vast amount of data and it’s not quite up-to-date and it’s not quite perfect, the sheer volume of what you’ve got that is correct means that when you do your BigData analysis the conclusions you draw from it are appropriate. So, big data doesn’t quite…

T: Statistically significant

J: … yes, exactly. What you’re after with big data is statistical significance. It’s relevance and if your data is pretty close to correct, then your statistics… and the volume is large… your statistics will overall be appropriate. Where’s the relational data is the opposite. Relational theory is a theory about insuring that you can represent data absolutely correctly or rather that you create a model and implement the model correctly, what you put into the model represents truth. It is self-consistent, it is correct within itself. And obviously you can create a model and then you can lie, you can say, ok, I can see I’ve got 10 boxes on the shelf but I’m gonna say I’ve got 8 boxes when I put in in the database. In that sense the relational theory doesn’t stop you from lying and getting the wrong answers but the relational theory says if you’re actually take the truth from the real world and put it in the correct model in the database what you’ve got in the database is correct, is true, and the database will ensure that the database is correct. And that’s why billing systems use relational databases. They do not use: “Oh, statistically, you probably owe us about, you know, 6000 roubles. Accurately you’re 6042 roubles and whatever 10 kopeek”. Is it kopeks? I’m sorry I’m trying to speak local and I should have checked currencies before I did. The relation database is correct, so it may end up that it handles – in general for most people – small data or the big data, but big data doesn’t have to be perfect; relational systems exist because sometimes the data has to be perfect. So the relational won’t go away, whatever anyone says, relational won’t go away, ‘cos relational is theoretical model. Now people may find mechanical ways of implementing it differently. The big data and NoSQL are not aimed to be implementations of the relational theory.

T: As a replacement it’s not appropriate?

J: It’s not, that is correct. There may be things that people do at present with Oracle databases which have a vast amount of data in them. And maybe they will look at them and say, you know, I’m paying for all this overhead to guarantee correctness of the data, but I don’t actually need correct data. And one of the clients I had some time back was one of the social networks. I won’t say which one, but it’s the social networks’ database which is about who said what to who and the little chat boxes and all the rest of it. Well, technically, you don’t need to make sure that you capture every little comment that every person has made to every other person. If the database crashed and you lost just a little bit of data and the database came up again, and someone said: “oh sure I told this person about my pet dog, but I can’t see it”. Well, it doesn’t really matter terribly much. And you may find out that people running things like social networks on Oracle, might say, well, let’s move to a completely different technology where I can have lots more people chatting and it’s cheaper to run the data set. And if I do lose a tiny bit, they are not gonna complain. They may be surprised, they may be puzzled, they may be a little bit confused that something’s gone missing. But, you know, as long as this can go pretty close, they won’t complain. Maybe if something appears which disappeared for 24 hours because it was over on a different node and it had a funny feel today with this node and it cleaned itself up a bit later – fine.

T: And there is redundancy also.

J: Exactly! The big data thing, the HDFS, this type of stuff allows for, you know, the data protects itself. It’s the sort of internet database, you know, you can destroy bits of it, but the rest of it will work around the hole in the middle and fill in the gaps type of thing. And that’s very much that sort of, it’s my mind anyway, that sort of feel to it, to the big data thing. It gets there, it gets there in the end. Right, and for some systems it’s perfectly OK to say it’s not perfect all the time but it’s good enough.

T: And another word which comes very often with BigData is the Cloud. And every person thinks different, I think, about the Cloud.

J: I’m not convinced that there is any direct connection between BigData and the Cloud. To me, when people say “the Cloud” my mind goes all the way back to about 1968-1969, when PDP 11 was top of the range, absolutely the most expensive fantastically powerful computer. And people used to set up computing bureaus where you got on to a telephone line, and you’d whistle down the telephone line and the computer understand it understood the whistling. It was bureau computing. The Cloud to me is just a very modern equivalent of that. We don’t really need to know where the resources are. Things like vMotion let us move virtual computers from physical machine to physical machine without even stopping. Well, the Cloud simply says, I’ve got a load of computers here that are very closely tied together, I can create a virtual computer out of many of those real computers or even blades or bits in a single box that I need. Oh, and I don’t really care whether it’s one smallish box sitting in my server room or whether I actually slide the whole virtual machine over the Internet into the enormous great box which is sitting in Oracle shores. To me that’s what the Cloud is, it’s a way of saying: “ I need computer resources, I don’t actually have to own them”. And it just happens the technology has just moved on so much, things like the VMWare type stuff, you look at things like RAC, RAC then went into Grid, where Grid said you’ll have a big RAC cluster, like we’ll have 14 old RAC clusters, but 8 of those nodes will be my HR applications and 12 of those nodes will be my general ledger, and 22 of those nodes, if I’ve got enough left, will be my web application server type of thing. Oh, and on Saturdays the thing gets a bit busy, so it takes some of the HR nodes off the HR cluster and a few off the GL cluster and move them to web cluster, and on Sunday night we’ll shift them back again. So this was just Oracle moving nodes across clusters. Across the virtual database and into the right clusters. Well, Cloud is just more of the same. We can move the resources where we need them. Or put it another way around. If you want to buy the resources off me you don’t need to know exactly where I’ve put those resources. So, it’s nothing new and exciting, it’s just bigger. Bigger and necessarily a lot faster. But it’s just selling computer time. And we now have much better technology for selling much more computer time. And you can start small. You can start… and this is Oracle in ODA. You can start with one little ODA sitting in your office and there’s a nice little sort of management interface and you can set it up so end users can say, you know, I’m allowed to ask for 6 CPUs if I need them. You know, run up the interface which is, you know, build me a virtual system, build me a virtual web application server with database behind it. With just a few clicks of a button. That little box can migrate its virtual systems across the network and you can put them in a big box or you can put them out onto the Internet and onto a third party supplier. Well, third party supplier means Oracle Corporation ideally. That’s, you know.

T: Thank you very much for this wonderful overview of this new technology. And…

J: I should put up on these slides under red lights: “I might be completely wrong. I might change my mind later. That’s just opinions over here. I’m not a technology forecasting guru. I could be quite off base on that sort of line of business”.

T: Still, you sound reasonable. And I think you are… can’t be mistaken too much.

J: I would not.

T: You are a book writer.

J: I have written books, yes.

T: So far it’s 3 books, I think.

J: Yes, 3 books I’ve written and 3 more I have contributed to.

T: Yes, and can you share us your experience how difficult it is to write a book.

J: Extremely difficult! My view on the books I’ve written, the thing I was trying to achieve by writing books, was to tell a story. You know, you look at the Oracle manuals and there is a huge amount of very useful information in them. And particularly over the last 3 or 4 years. I think, possibly the best two Oracle’s 10.2 manuals, but over the last years the quality of information is very very good. But in many ways it’s fragmented. You have to know which bit you have to look at to understand what you need to do. So, what I wanted to do in all of my books was try and pull all the things together into a continuous story so you can begin at the beginning and walk through that and gradually learn and make more complex and more advanced your understanding of how Oracle worked. So my aim is always to have a narrative. I can tell you 5000 little things about Oracle and it wouldn’t help you very much unless there were a couple of them you really needed to know. Lots of little facts in my mind is not helpful. I wanted to tell the whole story. From end to end. And that’s extremely difficult to start with. I’m actually structuring story so that you can produce the flow.
The other thing was I know a huge number of pieces about how Oracle works. And I have a huge amount of information which allows me to look at a specific problem and say, “All right, I can identify the cause of this problem is approximately related to this particular feature or this particular experience I’ve had before or this bit of implementation which I’ve looked at very closely”. But when I’m on site I can not say, well, I know roughly what’s going on, now that lets me build a little model or look a little more closely at what is actually happening in this specific case. And so I often know, say, 98% of what’s happening. And for a client on individual cases I need to work out last 2%. Well, that’s great when you’re writing a book. Nearly, ‘cos you can write 10 pages and just write 10 pages and do 10 pages perhaps in a day as you go through the basic descriptions. And then you get to the part where you think: “Oh, I’ve got one of these 2% points. I can’t just write this sentence, because I need to make sure I actually understand, or at least point out, the different things that could happen at this point”. And so suddenly, you sit there thinking: “I want to write 2 lines. That’s what I want to write: 2 lines. But I’ve got to check what actually happens”. And then you spend the next 2 days setting up little models and then just checking on if there are variations. And if those variations appear and all the rest of it, and then you go back and say, in those 2 lines, however, remember that if you do so and so this doesn’t quite work the same way. And you will see this. And you spend 2 days writing 2 lines and you spend 1 day writing 10 pages. It’s the Pareto, the 80-20 rule. It is like that: “You write 80% of the book in 20% of the time, and you write 20% of the book in 80% of the time”. It’s much worse than that. You write 95% of the book in 5% of the time and the last little bit of the book takes forever. And you have to do it, because you’ve got to get it as right as you can. Because you can be so misleading if you say something like: “Oh, yes, I’ve seen that happen once before, I’ll tell everyone that that’s how it works”. And of course that’s how it might work in 80% of the cases. And you’ve completely overlooked that lots of people do something a bit different, you haven’t catered for. And you can see this in lots of Oracle books. That really they’re just collections of odds and ends that people have noticed in the past or already saw on the Internet and they stick them all together into a book. And you read through some of the stuff and you think: “Well, yeah. You have described a fairly common case, but you haven’t allowed for…. Here are 10 common variations on that thing. Where is the thing which says “do not do this if you’re doing xyz”. Or “it actually costs twice as much if you switch on auditing or if you switch on a Standby database. And you look on some of the experiments that people run, their little models… people are getting a lot better at this, and it’s been proven hugely over last 5 to 10 years, that people would actually say: “Here is a model that actually shows you what happens, you can run it yourself and see it happen”. But it’s quite difficult to do that well. And building a sensible model requires a great deal of skills. And it’s very easy, no matter how good you are, it’s very easy to make mistakes. Mainly mistakes of things you didn’t think of checking. I’ve actually got a presentation which lasts about an hour and a half, “How to test”. And one of the things is, well, you can do this, that’s very simple, it gets you started, but you haven’t done this so we’ll make the test more complicated, you haven’t allowed for this, so we’ll make the test more complicated. And you get through to the end. But eventually you’ll have something which says: “I really need to model 25 individual sessions connecting to the database or coming through a pool to the database and inserting one row and committing and disappearing again. Because there are so many little optimizations, you go from something like “create table as select” to “I’ve got 25 concurrent sessions at any one instant randomly inserting data into this table one row at a time”.
You can validly make some comments about the data and how Oracle does things with that data on Create table with select. But actually, you do get to the cases where you need to say: “that’s not a valid model for a certain type of test”. And instead of doing something which generates 6 million rows in 5 minutes I have to run something which goes concurrently to generate 6 million rows in 45 minutes. Before I actually start doing my test case, my real test. And you can see this in lots of encounters that people have gone through the simpler cases and yet there are still more. Have you allowed for this, have you allowed for that? The point comes as you progress through the test as you progress through checking to see whether a concept could work in production, and this is where it becomes critical, of course, I said something about breaking things as quickly as possible. If you got to produce a new piece of code, if you got to implement a new bit of the application, you need to choose which Oracle features you are going to apply. You have to know how they work, you have to test them. And very often you can say, well, single table hash clusters, that looks like a really brilliant idea, now let’s start testing. And the idea is to say, well, what’s the simplest test I can do that proves it’s a bad idea? And you just go through that stage. If I’ve got 5 ideas, what’s the simplest thing I can do to eliminate 4 of those ideas and not waste time chasing a bad idea. So you go from a very simple starting test in the hope that things will go wrong very early on. But eventually you do have to get to a very good test. You have to get to something close to realistic and the closer you get to realism, the more time and effort it takes.
I can’t remember what the original question was, but I think I’ve probably said something interesting and useful… I hope I’ve said something interesting and useful there.

T: Absolutely. The question was about the books.

J: Oh, yeah. Yes, it was the books. You end up going to some of the really complicated tests and spending a lot of time. That’s where the problem comes with the book. You know what the basic idea is, but sometimes you gonna have to spend a lot of time just to finish off a couple of sentences or one paragraph. And yes I often found that I will get to the point where I’d say, OK, what I want to write is a paragraph and 2 days later I could write that paragraph. Or sometimes not write that paragraph and write a completely different paragraph, because I’ve discovered something new and different, because I was creating more realistic tests. And the effort to get the last 5% is enormous. Occasionally it’s very frustrating.

T: Thank you. Any plans, maybe, for more books?

J: Not very strong ones. People do ask me from time to time. Well, there was this suggestion from the CBO book, the Cost-Based Optimizer Book that there would be 2 more volumes. I think I’ve said there might be, or I have enough material for 2 more volumes, but problem is. This is why Oracle should go back and fix the old optimizer not bring in new bits! Then I can keep up with them and actually write a book about what happens. They change the optimizer faster than I can write about it. Plus of course there are a number of really highly competent people now in the Internet, who actually do write stuff about the optimizer, so it’s a lot easier for people to find out some of the stuff they need. I mean, my book gives you the basics, it gives you the core features that you need to know to recognize the sort of basic problems. The later books were about some of the more advanced stuff, some of the less common stuff. So I’ve decided there’s no way I can actually catch up and do, put all the rest things. Instead of being the 5% testing that I have to do for the new bits, it would be, half of it would be, like 50% of it would be work at it and do all the tests. But I do have a thought that time when I wrote Oracle Core and I sit there I think it’s roughly the first 2 chapters of Practical Oracle 8i revisited and made into a whole book. There is scope for a book called Oracle Structures . I’ve actually designed a seminar, a 2 days seminar for Oracle structures which is pretty much the rest of Practical Oracle 8i but brought very much up to date. Because, well, you look at most database systems and you have simple headp tables, occasionally people use partitioning, of course, the big stuff, and B-Tree indexes or if it’s a data warehouse you have bitmap indexes. And you can engineer a lot of performance into a system if you know what physical structures are most appropriate for what you want to do. And that book is a possible book and I do have odd little thoughts and odd little notes about, well, I would quite like to write that book. And perhaps I could write it without having to do the 50% maybe it would still be a 95-5 type of job. So, maybe, maybe, but no promises! There may be a book Oracle Structures.

T: I hope so. And the closing question. Tell us your favourite story about your consulting, like, the best thing you did to a client.

J: I think the best story I can tell is the story that makes you realize how nobody is really, how should I put this, what you think is the best thing you’ve ever done isn’t what people want you for. Cause the best thing I ever did, or rather the thing I did at one company which I think has got me the loudest, wildest, most enthusiastic praise was to fix the coffee machine!

T: Oh, really?

J: Absolutely. And they called me because they had performance problems. That’s why I ever go anywhere: “we have performance [problems]…” or actually some people call me and say, well, could you do a day briefing on Oracle to the development team or to the DBAs or whatever. So sometimes my visits to companies are actually about training days and talking about Oracle. But mostly it’s “fix our performance problems” or at least tell us about how to fix our performance problem. So I went to this company, quite a long time ago now and they had some really unpleasant SQL, and it was really really quite difficult to work out what it’s trying to do. It would be nice at that times to have all these little scripts that Carlos [Sierra] has got into his SQLT, cause you need to collect a huge amount of information about what they were trying to archive with that SQL, to understand what it was doing. DBAs didn’t know what the business was doing, we had to reverse engineer from the SQL to what the requirement was. Complicated stuff I picked up the worst one they had, quite a long it was 3 or 4 pages. Some people will say that’s nothing, but it was difficult. It took me 2 or 3 hours to figure out what it was doing, find out what the logical flaws were in the way it was doing, and write a piece of SQL that did the same thing much much faster. It was a classic 2 hours drop down to 3 minutes sort of fix. So I got this done and I called the DBAs and said OK, here’s the query, this is what it’s trying to do, this is why this version of it went badly, here is my version of it, this is why it’s actually logically the same query, will give you the same results, watch it run: BANG, 3 minutes. And the DBA says: “Yeah, fine, good, thanks”. And off they went. And I thought, oh, I deserve a cup of coffee. I’ve done a really good job here, I’m gonna make some coffee. I went to the kitchen, they had a wonderful espresso machine in the kitchen, and I put the cup on it, pressed the button and it went KADUNK and a red light came up. So I got a bit annoyed. So I took the machine apart and I discovered that it’s made up of 4 pieces really and I put them back as they were. So I took the machine apart, I checked there was coffee in it, I checked if there was water in it, I checked the reject container was empty, took it apart, put it back together again, pressed the button, and the machine went chchchchchchc and all this noise of espresso coming out of the machine. The DBAs are going rushing into the kitchen: “WOW YOU FIXED THAT COFFEE MACHINE!!!”. I thought, yeah, great they are paying me to fix their database but what they really care about? I fixed the coffee machine. And as someone once said, yes of course the coffee machine is the most important piece of kit in the IT department. But yeah, the best thing I ever did for a company, the thing they appreciated most, fixed the coffee machine. They were probably expecting me to fix their database, but fixing the coffee machine, that’s my proudest moment, that’s my happiest moment, shall we say. Obviously, I fixed the SQL and re-engineered databases, I’ve had great fun explaining things to people. And it’s very satisfying when you explain to someone why this thing works badly, and they say, “Ah! Right! OK, oh I can do…” You give them an idea and they say, “I’ve got 6 or 7 places I’m doing the same thing”. It’s very gratifying when you can do that. But the sheer excitement I recall from fixing a coffee machine is ridiculous. It’s a memory which will not go away.

T: So I think we can stop here. I’d like to thank you very much for this exciting talk!

J: My pleasure!

One Trackback

  1. By Interview | Oracle Scratchpad on February 9, 2014 at 14:47

    […] If you haven’t got the time to listen to the recording, there’s now a transcript online. […]

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s